A class of R-estimators based on the concepts of multivariate signed
ranks and the optimal rank-based tests developed in Hallin and
Paindaveine [Ann. Statist. 34 (2006)] is proposed for the estimation
of the shape matrix of an elliptical distribution. These R-estimators
are root-n consistent under any radial density g, without any moment
assumptions, and semiparametrically efficient at some prespecified
density f. When based on normal scores, they are uniformly more
efficient than the traditional normal-theory estimator based on
empirical covariance matrices (the asymptotic normality of which,
moreover, requires finite moments of order four), irrespective of the
actual underlying elliptical density. They rely on an original
rank-based version of Le Cam's one-step methodology which avoids the
unpleasant nonparametric estimation of cross-information quantities
that is generally required in the context of R-estimation. Although
they are not strictly affine-equivariant, they are shown to be
equivariant in a weak asymptotic sense. Simulations confirm their
feasibility and excellent finite-sample performances.

1. Introduction. 

1.1. Rank-based inference for elliptical families. An elliptical density over
$\mathbb{R}^k$ is determined by a location center $\theta \in \mathbb{R}^k$, a
scale parameter $\sigma \in \mathbb{R}^+_0$, a real-valued positive definite
symmetric $k \times k$ matrix $V = (V_{ij})$ with $V_{11} = 1$, the shape
matrix, and the so-called standardized radial density $g_1$; for a precise
definition and comments, see Section 1.2 of [13]. We shall hereafter refer
to the latter as HP, further referring to Section HP1.2, Proposition HP2.3,
Equation (HP4.5), etc.

Elliptical families have been introduced in multivariate analysis as a re- 
action against pervasive Gaussian assumptions. Most classical procedures 
in that field — principal components, discriminant analysis, canonical cor- 
relations, multivariate regression, etc. — readily extend to elliptical models, 
with shape playing the role of covariances or correlations. When $g_1$ is such
that the corresponding distribution has finite second-order moments, V is 
proportional to the covariance matrix and shape-based procedures coincide 
with the classical covariance-based ones; unlike covariances, however, shape 
still makes sense in the absence of moment restrictions. In such a context, 
robust inference methods, resisting arbitrarily heavy radial tails, are highly 
desirable and distribution-free rank-based methods naturally come into the 
picture (see [9, 10, 11, 12] for closely related results). 

1.2. Rank tests. In the hypothesis-testing context, HP develop a class
of semiparametrically optimal signed rank tests for null hypotheses of the
form $V = V_0$ ($\theta$, $\sigma$ and $g_1$ playing the role of nuisances).
Let $X_1, \ldots, X_n$ be a random sample from some elliptical distribution
characterized by $\theta$, $\sigma$, $V$ and $g_1$. Assuming that $\theta$ is
known (in practice, it can be replaced by any root-n consistent estimate
$\hat\theta$; see Section HP4.4), denote by
$Z_i := V_0^{-1/2}(X_i - \theta)$ the $\theta$-centered, $V_0$-standardized
observations. Define the rank $R_i$ as the rank of $d_i := \|Z_i\|$ among
$d_1, \ldots, d_n$ and the multivariate sign $U_i$ as $\|Z_i\|^{-1} Z_i$,
$i = 1, \ldots, n$. Considering the matrix-valued signed rank statistic

      $\underset{\sim}{S}_{f_1}(V_0) := \frac{1}{n} \sum_{i=1}^n K_{f_1}\big(\frac{R_i}{n+1}\big) U_i U_i'$,

where $K_{f_1} : (0,1) \to \mathbb{R}$ is the score function ensuring
optimality at $f_1$, the test statistic developed in HP takes the very simple
form [see (HP4.4)]

(1.1) $\underset{\sim}{Q}_{f_1}(V_0) := \frac{nk(k+2)}{2\mathcal{J}_k(f_1)} Q(\underset{\sim}{S}_{f_1}(V_0))$, where $Q(S) := \mathrm{tr}(S^2) - \frac{1}{k}(\mathrm{tr}\, S)^2$.


Test procedures based on (1.1) enjoy a number of attractive features:
(i) they are valid under arbitrary standardized radial densities $g_1$,
irrespective of any moment assumptions, (ii) they are nevertheless
(semiparametrically) efficient at some prespecified radial density $f_1$,
(iii) they exhibit surprisingly high asymptotic relative efficiencies with
respect to classical Gaussian procedures under non-Gaussian $g_1$'s and,
quite remarkably, (iv) when Gaussian (van der Waerden) scores are adopted,
their ARE's with respect to the classical Gaussian tests [21, 22, 34, 35]
are uniformly larger than one; see [38] for this extension of the celebrated
Chernoff-Savage [5] result to shape matrices.
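
The construction of Section 1.2 can be sketched numerically as follows. This is a minimal illustration under stated assumptions, not the HP implementation: the function and argument names are ours, the location is treated as known, ties among the distances are ignored, and the usage lines assume k = 2 with van der Waerden scores, for which the chi-square(2) quantile is -2 log(1 - u) and the information constant is taken to be k(k + 2) = 8.

```python
import numpy as np

def inv_sqrt(V):
    """Symmetric inverse square root of a positive definite matrix."""
    w, P = np.linalg.eigh(V)
    return (P / np.sqrt(w)) @ P.T

def signed_rank_statistic(X, theta, V0, score, info):
    """Return (S, Q) for H0: V = V0; `info` is J_k(f_1) for the chosen score."""
    n, k = X.shape
    Z = (X - theta) @ inv_sqrt(V0)        # rows: V0^{-1/2} (X_i - theta)
    d = np.linalg.norm(Z, axis=1)         # distances d_i
    U = Z / d[:, None]                    # multivariate signs U_i
    R = np.argsort(np.argsort(d)) + 1     # ranks of the d_i (no ties assumed)
    K = score(R / (n + 1.0))              # scores K(R_i / (n + 1))
    S = (U * K[:, None]).T @ U / n        # matrix-valued signed rank statistic
    spread = np.trace(S @ S) - np.trace(S) ** 2 / k   # Q(S) in (1.1)
    return S, n * k * (k + 2) / (2.0 * info) * spread

# Hypothetical usage: bivariate data, van der Waerden scores, J_2 = 8 (assumed).
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 2))
S, Q = signed_rank_statistic(X, np.zeros(2), np.eye(2),
                             lambda u: -2.0 * np.log1p(-u), 8.0)
```

Since the signs have unit norm, the trace of S equals the average of the scores, which gives a quick sanity check on any implementation.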

These optimality properties, in fact, are all possessed by the noncentrality
parameters of the noncentral chi-square asymptotic distributions, under
local alternatives, of the rank-based test statistic under consideration.
When the radial density, under such alternatives, is $g_1$, these
noncentrality parameters are quadratic forms characterized by a symmetric
positive definite matrix of the form
$\mathcal{J}_k^2(f_1, g_1) \mathcal{J}_k^{-1}(f_1) \Upsilon_k^{-1}(V)$, where
$\mathcal{J}_k(f_1, g_1)$ is a cross-information quantity (cf. (2.7)) and
$\Upsilon_k^{-1}(V)$ depends on neither $f_1$ nor $g_1$; see Proposition
HP4.1. This matrix, for $g_1 = f_1$, coincides with the efficient
information matrix $\mathcal{J}_k(f_1) \Upsilon_k^{-1}(V)$ for $V$ under $f_1$.

An immediate question which arises is whether such tests have any natural
counterparts in the context of point estimation. That is, can we construct
estimators $\hat V^{(n)}$ for the shape matrix that match the performances of
those rank-based tests, in the sense of (i) being root-n consistent under
any radial density $g_1$, irrespective of any moment assumptions (in sharp
contrast with the Gaussian estimators, which require finite second-order
moments for consistency and finite fourth-order moments for asymptotic
normality), (ii) being nevertheless (semiparametrically) efficient at some
prespecified standardized radial density $f_1$ and (iii) exhibiting the same
asymptotic relative efficiencies, with respect to classical Gaussian
estimators, including (iv) the Chernoff-Savage property of [38]? Such
estimators would improve the performance of the existing ones that satisfy
the consistency requirement (i), such as Tyler's [45] celebrated
affine-equivariant estimator of shape (scatter, in Tyler's terminology) or
the estimator of shape based on the Oja signs developed in [36]. These
estimators are indeed root-n consistent under extremely general conditions
(second-order moments, however, are required in [36]), but they are not
efficient.

The answer, as we shall see, is positive and the estimators achieving the
required performances are R-estimators based on the same concepts of
multivariate ranks and signs as the test statistics (1.1).

1.3. R-estimation. The derivation of such R-estimators, however, is by
no means straightforward. Traditional R-estimators are defined (and
computed) via the minimization of some rank-based objective function; see
[1, 19, 20, 24, 26] or the review paper by Draper [6]. In the present
context, this approach, in connection with (1.1), leads to the definition of
an R-estimator as

(1.2) $\underset{\sim}{V}^{(n)}_{f_1} := \arg\min_V \underset{\sim}{Q}_{f_1}(V) = \arg\min_V \big( \mathrm{tr}(\underset{\sim}{S}_{f_1}^2(V)) - \frac{1}{k} (\mathrm{tr}\, \underset{\sim}{S}_{f_1}(V))^2 \big)$,

that is, as the value of $V$ minimizing the sum of squared deviations of the
$k$ eigenvalues of the rank-based matrix $\underset{\sim}{S}_{f_1}(V)$ from
their arithmetic mean.

This "argmin" definition is intuitively quite appealing. However, from a
practical point of view, its implementation is numerically costly when the
dimension of the parameter is high [a shape parameter has $k(k+1)/2 - 1$
components]. The same definition is hardly more convenient from a
theoretical point of view: as a function of ranks, the objective function
$V \mapsto \underset{\sim}{Q}_{f_1}(V)$ is discontinuous and its
monotonicity/convexity properties are far from obvious, so root-n
consistency remains a nontrivial issue.
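
The reading of the objective in (1.2) as a spread of eigenvalues follows from tr(S^2) being the sum of the squared eigenvalues and tr S their sum. A short numerical check, with helper names of our own choosing, is:

```python
import numpy as np

def Q(S):
    """tr(S^2) - (tr S)^2 / k for a symmetric k x k matrix S."""
    k = S.shape[0]
    return float(np.trace(S @ S) - np.trace(S) ** 2 / k)

def eigenvalue_spread(S):
    """Sum of squared deviations of the eigenvalues of S from their mean."""
    lam = np.linalg.eigvalsh(S)
    return float(np.sum((lam - lam.mean()) ** 2))

# Arbitrary symmetric test matrix (illustrative only).
rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
S = (A + A.T) / 2.0
```

In particular, Q vanishes exactly when S is proportional to the identity, which is the situation the minimization in (1.2) tries to approach.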

Instead, therefore, we suggest a rank-based adaptation of Le Cam's one-step
construction of locally asymptotically optimal estimators. A version,
$\underset{\sim}{\Delta}^{(n)}_{f_1}(V)$, measurable with respect to the ranks
and signs associated with $V$, of the semiparametrically efficient (at $V$
and $f_1$) central sequence for shape can be constructed [see (HP4.1) or
(2.6)]; this central sequence is distribution-free with asymptotic
covariance matrix $\mathcal{J}_k(f_1) \Upsilon_k^{-1}(V)$. The $f_1$-score
version of our R-estimator, in vech form (that is, stacking the
upper-diagonal elements), is then defined as

(1.3) $\mathrm{vech}(\underset{\sim}{V}^{(n)}_{f_1}) := \mathrm{vech}(\hat V^{(n)}_T) + n^{-1/2} [\alpha^* \Upsilon_k^{-1}(\hat V^{(n)}_T)]^{-1} \underset{\sim}{\Delta}^{(n)}_{f_1}(\hat V^{(n)}_T)$,

where $\hat V^{(n)}_T$ is Tyler's estimator of scatter and $\alpha^*$ is a
consistent estimator of the cross-information quantity
$\mathcal{J}_k(f_1, g_1)$ [the problem of estimating
$\mathcal{J}_k(f_1, g_1)$ is discussed in Section 4]. The resulting
$\underset{\sim}{V}^{(n)}_{f_1}$ is a genuine R-estimator since the one-step
correction in (1.3) only depends on Tyler's $\hat V^{(n)}_T$ and the
corresponding ranks $\hat R_i$ and signs $\hat U_i$. Moreover, it is
asymptotically equivalent to a random matrix (depending on the actual $g_1$)
which is measurable with respect to the ranks and signs associated with the
"true" value of $V$. And if (1.2) admits a root-n consistent sequence of
solutions, this sequence of solutions and the one-step definition of
$\underset{\sim}{V}^{(n)}_{f_1}$ are asymptotically equivalent.

The main objective of this paper is to show that
$\underset{\sim}{V}^{(n)}_{f_1}$, as defined in (1.3), indeed satisfies the
properties listed under (i)-(iv) which are required of a semiparametrically
efficient R-estimator.

1.4. Outline of the paper. The outline of the paper is as follows. In
Section 2, we recall the main definitions related to elliptical symmetry,
local asymptotic normality and the relation between ranks and signs on one
hand and semiparametric efficiency on the other; whenever possible, we refer
to HP for reasons of brevity. Postponing to Section 4 the delicate problem
of choosing a consistent estimator $\alpha^*$ for $\mathcal{J}_k(f_1, g_1)$,
Section 3 deals with the derivation and asymptotic properties of the
one-step R-estimator (1.3) based on such arbitrary $\alpha^*$. Section 4 is
entirely devoted to the estimation of $\mathcal{J}_k(f_1, g_1)$. We start,
in Section 4.1, with a review of the various solutions that have been
considered in the literature, explaining why they fail to be fully
convincing. Sections 4.2 and 4.3 then propose an original, more
sophisticated (yet easily implementable) method inspired by local maximum
likelihood ideas. The resulting R-estimators enjoy all the asymptotic
properties expected from R-estimation and, moreover, yield surprisingly high
ARE's with respect to the existing methods: see Table 1. These estimators,
however, remain unsatisfactory on one count: for fixed sample size n, they
are not affine-equivariant. They are, nevertheless, equivariant in a weak
asymptotic sense, as shown in Section 5. A numerical study (Section 6)
confirms the excellent performance of the method. The Appendix collects
technical proofs.

2. Semiparametric efficiency under elliptical symmetry. 

2.1. Uniform local asymptotic normality. Let
$X^{(n)} := (X^{(n)}_1, \ldots, X^{(n)}_n)'$, $n \in \mathbb{N}$, be a
triangular array of k-dimensional observations. Let
$P^{(n)}_{\theta,\sigma^2,V;f_1}$ denote the distribution of $X^{(n)}$ under
the assumption that the $X^{(n)}_i$'s are i.i.d. with the elliptical density
$f_{\theta,\sigma^2,V;f_1}$ described in Section HP1.2 [which we refer to for
details, as well as for a precise definition of the parameters $\theta$,
$\sigma$, $V$ and $f_1$, the parameter spaces (including $\mathcal{V}_k$),
the radial distribution functions $\tilde F_{1k}$, the distances
$d^{(n)}_i(\theta, V)$, the ranks $R^{(n)}_i(\theta, V)$ and the signs
$U^{(n)}_i(\theta, V)$]. Our objective is the estimation of $V$ under
unspecified $\theta$, $\sigma^2$ and $f_1$.

The relevant statistical experiment involves the nonparametric family

(2.1) $\mathcal{P}^{(n)} := \bigcup_{f_1 \in \mathcal{F}_A} \mathcal{P}^{(n)}_{f_1} := \bigcup_{f_1 \in \mathcal{F}_A} \{ P^{(n)}_{\theta,\sigma^2,V;f_1} \mid \theta \in \mathbb{R}^k, \sigma > 0, V \in \mathcal{V}_k \}$,

where $f_1$ ranges over the set $\mathcal{F}_A$ of standardized radial
densities satisfying Assumptions (A1)-(A2) in HP. The main technical tool is
the uniform local asymptotic normality (ULAN), with respect to
$\vartheta := (\theta', \sigma^2, (\mathrm{vech}\, V)')'$, of the families
$\mathcal{P}^{(n)}_{f_1}$. This ULAN property is stated and proved in
Section HP2, which we refer to for the definitions of the score functions
(including $\varphi_{f_1}$ and $\psi_{f_1}$) and for the explicit forms of
the central sequences $\Delta^{(n)}_{f_1}(\vartheta)$ and information
matrices $\Gamma_{f_1}(\vartheta)$.

The block-diagonal structure of $\Gamma_{f_1}(\vartheta)$ and ULAN imply that
substituting (in principle, after adequate discretization) a root-n
consistent estimator $\hat\theta = \hat\theta^{(n)}$ for the unknown location
$\theta$ has no influence, asymptotically, on the $V$-part
$\Delta^{(n)}_{f_1;3}$ of the central sequence. Hence, optimal inference
about $V$ can be based, without any loss of (asymptotic) efficiency, on
$\Delta^{(n)}_{f_1;3}(\hat\theta, \sigma^2, V)$, as if $\hat\theta$ were the
actual location parameter. This actually follows from the asymptotic
linearity property of Section A.1. Therefore, in the derivation of
theoretical results, we may tacitly assume, without loss of generality, that
$\theta = 0$. The notation $P^{(n)}_{\sigma^2,V;f_1}$, $d^{(n)}_i(V)$,
$U^{(n)}_i(V)$, $\Delta^{(n)}_{f_1}(\sigma^2, V)$,
$\Gamma_{f_1}(\sigma^2, V)$, etc. will be used in an obvious way instead of
$P^{(n)}_{0,\sigma^2,V;f_1}$, $d^{(n)}_i(0, V)$, $U^{(n)}_i(0, V)$,
$\Delta^{(n)}_{f_1;3}(0, \sigma^2, V)$, $\Gamma_{f_1;3}(0, \sigma^2, V)$,
etc. Experiment (2.1) now takes the form

(2.2) $\mathcal{P}^{(n)} := \bigcup_{f_1 \in \mathcal{F}_A} \mathcal{P}^{(n)}_{f_1} = \bigcup_{f_1 \in \mathcal{F}_A} \bigcup_{\sigma > 0} \mathcal{P}^{(n)}_{\sigma^2;f_1} := \bigcup_{f_1 \in \mathcal{F}_A} \bigcup_{\sigma > 0} \{ P^{(n)}_{\sigma^2,V;f_1} \mid V \in \mathcal{V}_k \}$.

Although any root-n consistent estimator $\hat\theta$ could be used, we
suggest adopting the multivariate affine-equivariant median introduced by
Hettmansperger and Randles [18], which is itself a "sign-based" estimator.
The multivariate signs to be considered, then, are the
$U^{(n)}_i(\hat\theta, V)$'s and the ranks to be considered are those of the
$d^{(n)}_i(\hat\theta, V)$'s.

2.2. Semiparametric efficiency, ranks and signs. The partition (2.2) of
$\mathcal{P}^{(n)}$ into a collection of parametric subexperiments
$\mathcal{P}^{(n)}_{f_1}$, all indexed by $V$ and $\sigma^2$, induces a
semiparametric structure, where $V$ is the parameter of interest, while
$(\sigma^2, f_1)$ plays the role of a nuisance. Except for the unavoidable
loss of efficiency resulting from the presence of this nuisance, we would
like our estimators to be optimal, that is, to reach semiparametric
efficiency bounds, either at some prespecified radial density $f_1$ or at
any density belonging to some class $\mathcal{F}^*$ of radial densities.

The semiparametric efficiency bound at $f_1$ is provided by the so-called
efficient information matrix (see Section HP3.1),

(2.3) $\Gamma^*_{f_1}(V) := \frac{\mathcal{J}_k(f_1)}{4k(k+2)} M_k (V^{\otimes 2})^{-1/2} \big[ I_{k^2} + K_k - \frac{2}{k} J_k \big] (V^{\otimes 2})^{-1/2} M_k' =: \mathcal{J}_k(f_1) \Upsilon_k^{-1}(V)$;

we refer to Section HP1.4 for a definition of the matrices $V^{\otimes 2}$,
$K_k$, $J_k$ and $M_k$, as well as for those of $\mathcal{J}_k$ and further
quantities which we will use later on. This information matrix (2.3) is the
asymptotic covariance (under shape matrix $V$ and density $f_1$) of the
efficient central sequence

(2.4) $\Delta^{*(n)}_{f_1}(V) := \frac{1}{2} n^{-1/2} M_k (V^{\otimes 2})^{-1/2} J_k^{\perp} \sum_{i=1}^n \varphi_{f_1}\big(\frac{d_i}{\sigma}\big) \frac{d_i}{\sigma} \mathrm{vec}(U_i U_i')$

(see Section HP3.1) which, like $\Gamma^*_{f_1}(V)$, does not depend on
$\sigma$ (hence the notation). An estimator $\hat V^{(n)}$ of $V$ is
semiparametrically efficient at $(\sigma^2, f_1)$ iff the asymptotic
distribution under $P^{(n)}_{\sigma^2,V;f_1}$ of
$n^{1/2} \mathrm{vech}(\hat V^{(n)} - V)$ is the same as that of
$(\Gamma^*_{f_1}(V))^{-1} \Delta^{*(n)}_{f_1}(V)$, that is, iff, under
$P^{(n)}_{\sigma^2,V;f_1}$,

(2.5) $n^{1/2} \mathrm{vech}(\hat V^{(n)} - V) \xrightarrow{\mathcal{L}} \mathcal{N}(0, (\Gamma^*_{f_1}(V))^{-1})$.

The difference between $\Gamma_{f_1}(\sigma^2, V)$ and $\Gamma^*_{f_1}(V)$
quantifies the loss of information on $V$ which is due to the
non-specification of $(\sigma^2, f_1)$. It should be emphasized that,
whereas this loss depends on the definition of shape (that is, on the
arbitrary choice of the normalization $V_{11} = 1$), the semiparametric
information bound does not; see Sections HP3.1, HP3.2, [14] and [39] for
details.

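A general result by Hallin and Werker [17] suggests that, in case

(i) for all $f_1 \in \mathcal{F}_A$ and $\sigma > 0$, the sequence of
parametric subexperiments $\mathcal{P}^{(n)}_{\sigma^2;f_1}$ [see (2.2)] is
ULAN with central sequence $\Delta^{(n)}_{f_1}(\sigma^2, V)$ and information
matrix $\Gamma_{f_1}(\sigma^2, V)$ and

(ii) for all $V \in \mathcal{V}_k$ and $n \in \mathbb{N}$, the nonparametric
subexperiment
$\mathcal{P}^{(n)}_V := \{ P^{(n)}_{\sigma^2,V;f_1} \mid \sigma > 0, f_1 \in \mathcal{F}_A \}$
is generated by a group of transformations $\mathcal{G}^{(n)}_V$ with
maximal invariant $\sigma$-field $\mathcal{B}^{(n)}_V$,

then the projection
$E[\Delta^{(n)}_{f_1}(\sigma^2, V) \mid \mathcal{B}^{(n)}_V]$ of
$\Delta^{(n)}_{f_1}(\sigma^2, V)$ onto $\mathcal{B}^{(n)}_V$ yields a
distribution-free version of the semiparametrically efficient central
sequence (2.4).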

In the present context, this double structure exists: condition (i) is an
immediate consequence of Proposition HP2.1 and the generating groups
$\mathcal{G}^{(n)}_V$ are the groups of order-preserving radial
transformations described in Section HP4.1, which admit the ranks
$R_i = R^{(n)}_i(V)$ of the distances $d^{(n)}_i(V)$ and the multivariate
signs $U_i = U^{(n)}_i(V)$ as maximal invariants. Moreover,
$E[\Delta^{(n)}_{f_1}(\sigma^2, V) \mid R_1, \ldots, R_n, U_1, \ldots, U_n]$
is asymptotically equivalent to

(2.6) $\underset{\sim}{\Delta}^{(n)}_{f_1}(V) := \frac{1}{2} n^{-1/2} M_k (V^{\otimes 2})^{-1/2} J_k^{\perp} \sum_{i=1}^n K_{f_1}\big(\frac{R_i}{n+1}\big) \mathrm{vec}(U_i U_i')$
      $= \frac{1}{2} n^{-1/2} M_k (V^{\otimes 2})^{-1/2} \sum_{i=1}^n \big\{ K_{f_1}\big(\frac{R_i}{n+1}\big) \mathrm{vec}(U_i U_i') - \frac{m^{(n)}_{f_1}}{k} \mathrm{vec}(I_k) \big\}$

(see Lemma HP4.1), with
$K_{f_1}(u) := \varphi_{f_1}(\tilde F_{1k}^{-1}(u)) \tilde F_{1k}^{-1}(u)$ and
exact centerings
$m^{(n)}_{f_1} := \frac{1}{n} \sum_{i=1}^n K_{f_1}(i/(n+1))$.

The properties of $\underset{\sim}{\Delta}^{(n)}_{f_1}(V)$ are summarized in
Proposition 2.1 below. For any $g_1 \in \mathcal{F}_A$, define
$\Gamma^*_{f_1,g_1}(V) := \mathcal{J}_k(f_1, g_1) \Upsilon_k^{-1}(V)$, where

(2.7) $\mathcal{J}_k(f_1, g_1) := \int_0^1 K_{f_1}(u) K_{g_1}(u) \, du$

(a cross-information quantity); the notation $\tilde G_{1k}$,
$\varphi_{g_1}$ is used in an obvious way. Note that
$\mathcal{J}_k(f_1, f_1) = \mathcal{J}_k(f_1)$ so that
$\Gamma^*_{f_1,f_1}(V)$ reduces to $\Gamma^*_{f_1}(V)$.

Proposition 2.1. For any $f_1 \in \mathcal{F}_A$, the rank-based random
vector $\underset{\sim}{\Delta}^{(n)}_{f_1}(V)$

(i) is distribution-free under
$\{ P^{(n)}_{\sigma^2,V;g_1} \mid \sigma > 0, g_1 \in \mathcal{F} \}$, where
$\mathcal{F}$ denotes the class of all possible standardized radial
densities;

(ii) is asymptotically equivalent, in $P^{(n)}_{\sigma^2,V;g_1}$-probability
for any $g_1 \in \mathcal{F}$, to

(2.8) $\Delta^{*(n)}_{f_1;g_1}(V) := \frac{1}{2} n^{-1/2} M_k (V^{\otimes 2})^{-1/2} J_k^{\perp} \sum_{i=1}^n K_{f_1}\big( \tilde G_{1k}\big(\frac{d_i}{\sigma}\big) \big) \mathrm{vec}(U_i U_i')$,

hence, in $P^{(n)}_{\sigma^2,V;f_1}$-probability, to the semiparametrically
efficient (at $f_1$, for any $\sigma$) central sequence for shape (2.4);

(iii) is asymptotically normal under
$\{ P^{(n)}_{\sigma^2,V;g_1} \mid \sigma > 0, g_1 \in \mathcal{F} \}$, with
mean zero and covariance matrix $\Gamma^*_{f_1}(V)$;

(iv) is asymptotically normal under
$P^{(n)}_{\sigma^2,V+n^{-1/2}v;g_1}$, with mean
$\Gamma^*_{f_1,g_1}(V) \mathrm{vech}(v)$ and covariance matrix
$\Gamma^*_{f_1}(V)$ for any symmetric matrix $v$ such that $v_{11} = 0$, any
$\sigma > 0$ and any $g_1 \in \mathcal{F}_A$;

(v) satisfies, under $P^{(n)}_{\sigma^2,V;g_1}$, as $n \to \infty$, the
asymptotic linearity property

(2.9) $\underset{\sim}{\Delta}^{(n)}_{f_1}(V + n^{-1/2} v^{(n)}) - \underset{\sim}{\Delta}^{(n)}_{f_1}(V) = -\Gamma^*_{f_1,g_1}(V) \mathrm{vech}(v^{(n)}) + o_P(1)$

for any bounded sequence of symmetric matrices $v^{(n)}$ such that
$v^{(n)}_{11} = 0$, any $\sigma > 0$ and any $g_1 \in \mathcal{F}_A$.

Proof. Part (i): distribution-freeness readily follows from the
distribution-freeness, under ellipticity, of the ranks $R^{(n)}_i(V)$ and
the signs $U^{(n)}_i(V)$ with respect to which
$\underset{\sim}{\Delta}^{(n)}_{f_1}(V)$ is measurable. Part (ii) is covered
by Lemma HP4.1. Parts (iii)-(iv) are established in the proof of Proposition
HP4.1. Part (v) follows from the more general result given in Proposition
A.1 (see Appendix A.1). □

3. Optimal one-step R-estimation of shape. Tyler's celebrated estimator
of shape, $\hat V^{(n)}_T$, was introduced by Tyler [45] based on the very
simple idea that if $X$ is elliptical with location $\theta$, then its shape
$V$ is entirely characterized by the fact that
$U(\theta, V) := V^{-1/2}(X - \theta) / \|V^{-1/2}(X - \theta)\|$ is
centered, with covariance $(1/k) I_k$. Accordingly, $\hat V^{(n)}_T$ is
defined as the unique shape matrix satisfying
$\frac{1}{n} \sum_{i=1}^n U^{(n)}_i(\theta, V) (U^{(n)}_i(\theta, V))' = \frac{1}{k} I_k$.
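
Tyler's defining equation is commonly solved by a fixed-point iteration. The sketch below is one such scheme and is offered as an assumption rather than as the authors' algorithm: it takes the location as known, starts from the identity, and renormalizes so that the upper-left entry equals one at every step; the function name and tolerances are ours.

```python
import numpy as np

def tyler_shape(X, theta, tol=1e-10, max_iter=500):
    """Fixed-point iteration for Tyler's shape matrix, with V[0, 0] = 1."""
    Y = X - theta
    n, k = Y.shape
    V = np.eye(k)
    for _ in range(max_iter):
        # Squared distances (X_i - theta)' V^{-1} (X_i - theta).
        q = np.einsum("ij,jk,ik->i", Y, np.linalg.inv(V), Y)
        V_new = k * (Y / q[:, None]).T @ Y / n
        V_new /= V_new[0, 0]              # shape normalization V_11 = 1
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new
    return V

# Hypothetical usage on simulated Gaussian data with a non-spherical shape.
rng = np.random.default_rng(2)
A = np.array([[1.0, 0.0, 0.0], [0.5, 2.0, 0.0], [0.0, 1.0, 0.5]])
X = rng.standard_normal((400, 3)) @ A.T
V_T = tyler_shape(X, np.zeros(3))
```

At convergence the standardized signs have empirical covariance close to the identity divided by k, which is the characterization stated above.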

Denote by $\hat V^{(n)}_{\#}$ a discretized version of $\hat V^{(n)}_T$. Such
discretizations, which turn root-n consistent preliminary estimators into
uniformly root-n consistent ones (see, e.g., Lemma 4.4 in [30] for a typical
use), are quite standard in Le Cam's one-step construction of estimators
(see [31]), and several of them, characterized by a # subscript, will appear
in the sequel. Denoting by $\lceil x \rceil$ the smallest integer larger than
or equal to $x$ and by $c_0$ an arbitrary positive constant that does not
depend on $n$, the discretized shape $\hat V^{(n)}_{\#}$ can be obtained, for
instance, by mapping each entry $v_{ij}$, $(i,j) \neq (1,1)$, of
$\hat V^{(n)}_T$ onto
$v_{ij\#} := c_0^{-1} \mathrm{sign}(v_{ij}) n^{-1/2} \lceil n^{1/2} c_0 |v_{ij}| \rceil$.
In practice (where $n = n_0$ is fixed), such discretization is not required
(as $c_0$ can be arbitrarily large) and actually makes little sense, as one
can always decide to start discretization at $n = n_0 + 1$; see Section 4.3
for practical implementation.
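
For concreteness, the entrywise discretization map just described can be written in a few lines. The function name and default value of the constant are illustrative choices of ours.

```python
import math

def discretize(v, n, c0=1.0):
    """Map v to sign(v) * ceil(sqrt(n) * c0 * |v|) / (c0 * sqrt(n))."""
    root_n = math.sqrt(n)
    magnitude = math.ceil(root_n * c0 * abs(v)) / (c0 * root_n)
    return math.copysign(magnitude, v)

# Hypothetical usage: one entry of a preliminary estimate, n = 100, c0 = 2.
grid_value = discretize(0.3, n=100, c0=2.0)
```

The mapped value never moves an entry by more than 1/(c_0 n^{1/2}), which is why a large c_0 makes the discretization immaterial for fixed n.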

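Since $\underset{\sim}{\Delta}^{(n)}_{f_1}(V)$ is a version of the efficient
central sequence for shape, Le Cam's classical one-step method suggests
estimating $\mathrm{vech}(V)$ by means of

(3.1) $\mathrm{vech}(\underset{\sim}{V}^{(n)}_{f_1;g_1}) := \mathrm{vech}(\hat V^{(n)}_{\#}) + n^{-1/2} (\Gamma^*_{f_1,g_1}(\hat V^{(n)}_{\#}))^{-1} \underset{\sim}{\Delta}^{(n)}_{f_1}(\hat V^{(n)}_{\#})$.

Such an estimator is semiparametrically efficient at $f_1$, in the sense of
(2.5). Indeed, in view of Proposition 2.1 and the continuity of
$V \mapsto \Gamma^*_{f_1,g_1}(V)$,

      $n^{1/2} \mathrm{vech}(\underset{\sim}{V}^{(n)}_{f_1;g_1} - V) = n^{1/2} \mathrm{vech}(\hat V^{(n)}_{\#} - V) + (\Gamma^*_{f_1,g_1}(\hat V^{(n)}_{\#}))^{-1} \underset{\sim}{\Delta}^{(n)}_{f_1}(\hat V^{(n)}_{\#})$
      $= n^{1/2} \mathrm{vech}(\hat V^{(n)}_{\#} - V) + (\Gamma^*_{f_1,g_1}(\hat V^{(n)}_{\#}))^{-1} \big( \underset{\sim}{\Delta}^{(n)}_{f_1}(V) - \Gamma^*_{f_1,g_1}(V) n^{1/2} \mathrm{vech}(\hat V^{(n)}_{\#} - V) \big) + o_P(1)$
(3.2) $= (\Gamma^*_{f_1,g_1}(V))^{-1} \underset{\sim}{\Delta}^{(n)}_{f_1}(V) + o_P(1)$
(3.3) $= (\Gamma^*_{f_1,g_1}(V))^{-1} \Delta^{*(n)}_{f_1;g_1}(V) + o_P(1)$

under $P^{(n)}_{\sigma^2,V;g_1}$, as $n \to \infty$, where application to
$\underset{\sim}{\Delta}^{(n)}_{f_1}(\hat V^{(n)}_{\#})$ of the asymptotic
linearity property (2.9) is made possible, as usual, by the local
discreteness of $\hat V^{(n)}_{\#}$. The asymptotic representation (3.3)
implies, for $g_1 = f_1$, the efficiency of
$\underset{\sim}{V}^{(n)}_{f_1;g_1}$ whereas (3.2), by providing for
$\underset{\sim}{V}^{(n)}_{f_1;g_1}$ an asymptotic representation as a
signed-rank-measurable quantity, justifies its status as an R-estimator.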

A major problem, unfortunately, is that (3.1), via
$\Gamma^*_{f_1,g_1}(\hat V^{(n)}_{\#})$, involves the unknown
cross-information quantity $\mathcal{J}_k(f_1, g_1)$ defined in (2.7);
$\underset{\sim}{V}^{(n)}_{f_1;g_1}$ is, therefore, just a pseudo-estimator
which cannot be computed from the observations. In order to obtain a genuine
estimator, $\underset{\sim}{V}^{(n)}_{f_1\#}$ say, a consistent estimator
$\alpha^*$ must clearly be substituted for $\mathcal{J}_k(f_1, g_1)$. This
estimation of $\mathcal{J}_k(f_1, g_1)$ is absolutely crucial in several
respects since it not only explicitly enters the definition of the one-step
estimator, but also characterizes its asymptotic covariance. However,
obtaining a consistent estimator $\alpha^*$ of $\mathcal{J}_k(f_1, g_1)$
(the expectation of a function that depends on the unknown underlying $g_1$)
is a delicate problem. Accordingly, we defer the discussion of this issue to
Section 4, where, after a review of the various methods available in the
literature, we present an original method inspired by local maximum
likelihood ideas.

Therefore, in the present section, we define the $f_1$-score R-estimator
$\underset{\sim}{V}^{(n)}_{f_1\#}$ as the value of
$\underset{\sim}{V}^{(n)}_{f_1;g_1}$ resulting from substituting into (3.1)
an arbitrary consistent estimator $\alpha^*$ for the unknown
$\mathcal{J}_k(f_1, g_1)$. Up to discretization,
$\underset{\sim}{V}^{(n)}_{f_1\#}$ thus is defined as in (1.3). Irrespective
of the choice of $\alpha^*$, the resulting one-step R-estimators
$\underset{\sim}{V}^{(n)}_{f_1\#}$ are asymptotically equivalent (under
$P^{(n)}_{\sigma^2,V;g_1}$) to the pseudo-estimator
$\underset{\sim}{V}^{(n)}_{f_1;g_1}$ and, hence, also to the signed rank
statistics (3.2) based on the "genuine ranks." Proposition 3.1 summarizes
the main properties of these estimators: (i) they are asymptotically
equivalent to a function of the genuine ranks and signs, they are
asymptotically normal, and their covariance matrix is the inverse of the
covariance matrix characterizing the local powers of the optimal rank tests
derived in HP, (ii) when based on $f_1$-scores, they are semiparametrically
efficient at radial density $f_1$, (iii) for finite n, they can be expressed
as a linear combination of the Tyler shape matrix and a rank-based shape
matrix involving the Tyler ranks and signs, (iv) their asymptotic covariance
matrix, under any elliptical density, is proportional to the asymptotic
covariance matrices of the Tyler and Gaussian ML estimators. The
proportionality constant, which can be considered as a measure of asymptotic
relative efficiency, is provided in (v). In order to obtain a simpler
expression for the asymptotic covariance matrix of
$\mathrm{vec}(\underset{\sim}{V}^{(n)}_{f_1\#})$ (cf. (3.8)), we define
$Q_k(V) := [k(k+2)]^{-1} M_k' \Upsilon_k(V) M_k$. As shown in the proof of
Lemma HP3.1 (with $N_k$ defined in Section HP1.4),

(3.4) $\Upsilon_k(V) = k(k+2) N_k Q_k(V) N_k'$.

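Proposition 3.1. Let $f_1$ and $g_1$ belong to $\mathcal{F}_A$. Then

(i) under $P^{(n)}_{\sigma^2,V;g_1}$, as $n \to \infty$,

(3.5) $n^{1/2} \mathrm{vech}(\underset{\sim}{V}^{(n)}_{f_1\#} - V) = (\Gamma^*_{f_1,g_1}(V))^{-1} \underset{\sim}{\Delta}^{(n)}_{f_1}(V) + o_P(1)$
(3.6) $= (\Gamma^*_{f_1,g_1}(V))^{-1} \Delta^{*(n)}_{f_1;g_1}(V) + o_P(1)$
(3.7) $\xrightarrow{\mathcal{L}} \mathcal{N}\big(0, (\mathcal{J}_k(f_1) / \mathcal{J}_k^2(f_1, g_1)) \Upsilon_k(V)\big)$

or, in terms of $\mathrm{vec}\, V$,

(3.8) $n^{1/2} \mathrm{vec}(\underset{\sim}{V}^{(n)}_{f_1\#} - V) \xrightarrow{\mathcal{L}} \mathcal{N}\big(0, (k(k+2) \mathcal{J}_k(f_1) / \mathcal{J}_k^2(f_1, g_1)) Q_k(V)\big)$;

(ii) $\underset{\sim}{V}^{(n)}_{f_1\#}$ is semiparametrically efficient at
$\{ P^{(n)}_{\sigma^2,V;f_1} \mid \sigma > 0, V \in \mathcal{V}_k \}$;

(iii)

(3.9) $\underset{\sim}{V}^{(n)}_{f_1\#} = \big[ 1 - \frac{k(k+2)}{\alpha^*} (\hat W^{(n)}_{f_1})_{11} \big] \hat V^{(n)}_T + \big( \frac{k(k+2)}{\alpha^*} (\hat W^{(n)}_{f_1})_{11} \big) \hat W^{(n)}_{f_1} / (\hat W^{(n)}_{f_1})_{11}$

for all n, where $\hat W^{(n)}_{f_1} := W^{(n)}_{f_1}(\hat V^{(n)}_T)$, with

(3.10) $W^{(n)}_{f_1}(V) := V^{1/2} \big[ \frac{1}{n} \sum_{i=1}^n K_{f_1}\big(\frac{R^{(n)}_i(V)}{n+1}\big) U^{(n)}_i(V) U^{(n)\prime}_i(V) \big] V^{1/2}$

and $\alpha^*$ is the consistent estimator of $\mathcal{J}_k(f_1, g_1)$
entering the construction of $\underset{\sim}{V}^{(n)}_{f_1\#}$;

(iv) the Gaussian ML estimator is
$\hat V^{(n)}_{\mathcal{N}} := S^{(n)} / (S^{(n)})_{11}$ with

      $S^{(n)} := (n-1)^{-1} \sum_{i=1}^n (X_i - \bar X)(X_i - \bar X)'$;

provided that the kurtosis coefficient
$\kappa_k(g_1) := (k E_k(g_1)) / ((k+2) D_k^2(g_1)) - 1$ [where we let
$E_k(g_1) := \int_0^1 (\tilde G_{1k}^{-1}(u))^4 \, du$ and
$D_k(g_1) := \int_0^1 (\tilde G_{1k}^{-1}(u))^2 \, du$] is finite, then under
$P^{(n)}_{\sigma^2,V;g_1}$,

      $n^{1/2} \mathrm{vec}(\hat V^{(n)}_{\mathcal{N}} - V) \xrightarrow{\mathcal{L}} \mathcal{N}\big(0, (1 + \kappa_k(g_1)) Q_k(V)\big)$ as $n \to \infty$;

(v) the ARE (i.e., the inverse ratio of asymptotic variances) under
$P^{(n)}_{\sigma^2,V;g_1}$, where $g_1$ is such that $\kappa_k(g_1) < \infty$
(resp., without any moment assumption on $g_1$), of
$\underset{\sim}{V}^{(n)}_{f_1\#}$ with respect to
$\hat V^{(n)}_{\mathcal{N}}$ (resp., with respect to $\hat V^{(n)}_T$) is

      $\frac{1 + \kappa_k(g_1)}{k(k+2)} \frac{\mathcal{J}_k^2(f_1, g_1)}{\mathcal{J}_k(f_1)}$   (resp., $\frac{1}{k^2} \frac{\mathcal{J}_k^2(f_1, g_1)}{\mathcal{J}_k(f_1)}$).

Proof. See Appendix (Section A.2). □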
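
To make the finite-n representation in part (iii) concrete, the following sketch computes the rank-based matrix of (3.10) at a preliminary shape estimate and applies the update V_T + (k(k+2)/alpha*) [W - W_11 V_T], which is how we read (3.9) after working out the one-step definition (3.1). Everything here is a hedged illustration: the function name is ours, the location is taken as known, and the value of alpha* is simply supplied by the caller since its estimation is the subject of Section 4.

```python
import numpy as np

def one_step_shape(X, theta, V_T, score, alpha):
    """One-step update of a preliminary shape estimate V_T (with V_T[0,0] = 1)."""
    n, k = X.shape
    w, P = np.linalg.eigh(V_T)
    root = (P * np.sqrt(w)) @ P.T         # V_T^{1/2}
    inv_root = (P / np.sqrt(w)) @ P.T     # V_T^{-1/2}
    Z = (X - theta) @ inv_root
    d = np.linalg.norm(Z, axis=1)
    U = Z / d[:, None]                    # signs computed at V_T
    R = np.argsort(np.argsort(d)) + 1     # ranks computed at V_T
    K = score(R / (n + 1.0))
    W = root @ ((U * K[:, None]).T @ U / n) @ root   # rank-based matrix (3.10)
    c = k * (k + 2) / alpha
    return V_T + c * (W - W[0, 0] * V_T)

# Hypothetical usage: k = 2, van der Waerden scores, alpha* = 8 supplied by hand;
# V_prelim merely stands in for Tyler's estimate.
rng = np.random.default_rng(3)
X = rng.standard_normal((300, 2))
V_prelim = np.array([[1.0, 0.2], [0.2, 1.5]])
V_R = one_step_shape(X, np.zeros(2), V_prelim, lambda u: -2.0 * np.log1p(-u), 8.0)
```

The update preserves the normalization of the upper-left entry by construction, and a very large alpha* leaves the preliminary estimate essentially unchanged.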



Note that the ARE's in part (v) of the proposition are unambiguously
defined, despite the multivariate setting, as the asymptotic covariance
matrices of (the vec versions of) $\underset{\sim}{V}^{(n)}_{f_1\#}$,
$\hat V^{(n)}_{\mathcal{N}}$ and $\hat V^{(n)}_T$ all are proportional to
$Q_k(V)$. Their relative performances can thus be described by a single
number, a fact that was already observed in [44] (see also [33]); the
situation is entirely different for covariance matrices, where two numbers
are required [36, 37, 43].

These ARE's coincide with those obtained in HP for the problem of testing
$V = V_0$ (see Proposition HP4.2). An immediate corollary is that the
Chernoff-Savage result of [38] also applies here: the ARE's of the van der
Waerden (Gaussian-score) versions $\underset{\sim}{V}^{(n)}_{vdW\#}$ of our
R-estimators ($K_{f_1} = \Psi_k^{-1}$, where $\Psi_k$ stands for the
chi-square distribution function with k degrees of freedom; see Section
HP4.2) with respect to the Gaussian estimator $\hat V^{(n)}_{\mathcal{N}}$
are uniformly larger than or equal to one (with equality only at the
multinormal); the Pitman-inadmissibility of $\hat V^{(n)}_{\mathcal{N}}$
follows.
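
The ARE expressions of Proposition 3.1(v) are easy to evaluate numerically in dimension k = 2, where the scores have closed forms. The sketch below rests on assumptions that are ours and not taken from the text: the F(2, nu) quantile is written in closed form, which gives the Student score (2 + nu)(1 - (1 - u)^{2/nu}); the van der Waerden score is the chi-square(2) quantile -2 log(1 - u); the cross-information (2.7) is approximated by a midpoint rule; and the kurtosis of the bivariate Student density is taken to be 2/(nu - 4).

```python
import math

def student_score(nu):
    """Score K_nu for k = 2, via the closed-form F(2, nu) quantile."""
    return lambda u: (2.0 + nu) * (1.0 - (1.0 - u) ** (2.0 / nu))

def vdw_score(u):
    """Van der Waerden score for k = 2 (chi-square(2) quantile)."""
    return -2.0 * math.log1p(-u)

def cross_info(K_f, K_g, m=100_000):
    """Midpoint-rule approximation of the integral in (2.7)."""
    h = 1.0 / m
    return h * sum(K_f((i + 0.5) * h) * K_g((i + 0.5) * h) for i in range(m))

def are_vs_tyler(K_f, K_g, k=2):
    """ARE of the f-score R-estimator w.r.t. Tyler's estimator under g."""
    return cross_info(K_f, K_g) ** 2 / (k ** 2 * cross_info(K_f, K_f))

def are_vs_gaussian(K_f, K_g, kappa, k=2):
    """ARE w.r.t. the Gaussian estimator under g with kurtosis kappa."""
    ratio = cross_info(K_f, K_g) ** 2 / cross_info(K_f, K_f)
    return (1.0 + kappa) * ratio / (k * (k + 2))

# Hypothetical usage: t_3 scores under a t_10 density (kappa = 2 / (10 - 4)).
tyler_are = are_vs_tyler(student_score(3), student_score(10))
gauss_are = are_vs_gaussian(student_score(3), student_score(10), 1.0 / 3.0)
```

Under these assumptions the two values come out close to 1.651 and 1.101, in line with the k = 2 entries of Table 1 for t_3 scores under t_10.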

Table 1 provides some numerical values, under various Student ($t_\nu$) and
normal ($\mathcal{N}$) radial densities $g_1$, of the ARE's in Proposition
3.1(v); for details on elliptical Student densities, see Section HP1.2. Note
that under Student densities with four degrees of freedom or less, the ARE
of $\underset{\sim}{V}^{(n)}_{f_1\#}$ with respect to
$\hat V^{(n)}_{\mathcal{N}}$ is infinite since
$n^{1/2}(\hat V^{(n)}_{\mathcal{N}} - V)$ is not even $O_P(1)$. Also, note
that the limits as $\nu \to 0$ of the ARE's under $t_\nu$, with respect to
Tyler's $\hat V^{(n)}_T$, of any $\underset{\sim}{V}^{(n)}_{\nu_0\#}$ (the
R-estimator associated with $t_{\nu_0}$ scores) and
$\underset{\sim}{V}^{(n)}_{vdW\#}$ are relatively modest and strictly less
than one; see column $t_0$ in Table 1 for numerical values. In fact,

      $\lim_{\nu \to 0} \mathrm{ARE}_{t_\nu}[\underset{\sim}{V}^{(n)}_{\nu_0\#} / \hat V^{(n)}_T] = \frac{k(k + \nu_0 + 2)}{(k+2)(k + \nu_0)} < 1$

and

      $\lim_{\nu \to 0} \mathrm{ARE}_{t_\nu}[\underset{\sim}{V}^{(n)}_{vdW\#} / \hat V^{(n)}_T] = \frac{k}{k+2} < 1$.
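
The two limiting expressions are simple enough to tabulate directly; the helper names below are ours.

```python
def limit_are_student_scores(k, nu0):
    """Limit, as nu -> 0, of the ARE of the t_{nu0}-score estimator vs Tyler's."""
    return k * (k + nu0 + 2.0) / ((k + 2.0) * (k + nu0))

def limit_are_vdw(k):
    """Limit, as nu -> 0, of the ARE of the van der Waerden estimator vs Tyler's."""
    return k / (k + 2.0)

# Hypothetical usage: limiting values for k = 2, 3, 4, 6, 10 and nu0 = 3.
limits_nu0_3 = [round(limit_are_student_scores(k, 3.0), 3) for k in (2, 3, 4, 6, 10)]
```

Both limits increase to one as k grows, so the loss with respect to Tyler's estimator under very heavy tails fades in high dimension.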
Table 1

ARE's of the rank-based estimators $\underset{\sim}{V}_{0.5\#}$,
$\underset{\sim}{V}_{3\#}$, $\underset{\sim}{V}_{10\#}$ and
$\underset{\sim}{V}_{vdW\#}$ (associated with $t_{0.5}$, $t_3$, $t_{10}$ and
Gaussian scores, respectively) with respect to Tyler's $\hat V_T$ and, in
parentheses, with respect to the Gaussian estimator $\hat V_{\mathcal{N}}$,
under k-variate Student densities (with $\nu$ degrees of freedom,
$\nu = 0.5, 3, 10$), along with the limiting values obtained for
$\nu \to 0$ and $\nu \to \infty$ (the multinormal case), for k = 2, 3, 4, 6
and 10.

                                    Underlying density
Estimator    k    t_0          t_0.5        t_3          t_10             N
V_0.5#       2    0.900 (∞)    1.111 (∞)    1.246 (∞)    1.280 (0.853)    1.296 (0.648)
             3    0.943 (∞)    1.061 (∞)    1.145 (∞)    1.173 (0.939)    1.189 (0.713)
             4    0.963 (∞)    1.038 (∞)    1.098 (∞)    1.121 (0.996)    1.136 (0.757)
             6    0.981 (∞)    1.020 (∞)    1.054 (∞)    1.070 (1.070)    1.083 (0.813)
            10    0.992 (∞)    1.008 (∞)    1.024 (∞)    1.034 (1.149)    1.044 (0.870)

V_3#         2    0.700 (∞)    0.969 (∞)    1.429 (∞)    1.651 (1.101)    1.792 (0.896)
             3    0.800 (∞)    0.972 (∞)    1.250 (∞)    1.400 (1.120)    1.507 (0.904)
             4    0.857 (∞)    0.977 (∞)    1.167 (∞)    1.278 (1.136)    1.366 (0.911)
             6    0.917 (∞)    0.985 (∞)    1.091 (∞)    1.162 (1.162)    1.229 (0.921)
            10    0.962 (∞)    0.992 (∞)    1.040 (∞)    1.078 (1.198)    1.123 (0.936)

V_10#        2    0.583 (∞)    0.829 (∞)    1.376 (∞)    1.714 (1.143)    1.961 (0.980)
             3    0.692 (∞)    0.861 (∞)    1.212 (∞)    1.444 (1.156)    1.633 (0.979)
             4    0.762 (∞)    0.887 (∞)    1.136 (∞)    1.313 (1.167)    1.468 (0.979)
             6    0.844 (∞)    0.921 (∞)    1.070 (∞)    1.185 (1.185)    1.304 (0.978)
            10    0.917 (∞)    0.955 (∞)    1.027 (∞)    1.091 (1.212)    1.174 (0.978)

V_vdW#       2    0.500 (∞)    0.720 (∞)    1.280 (∞)    1.681 (1.120)    2.000 (1.000)
             3    0.600 (∞)    0.757 (∞)    1.130 (∞)    1.415 (1.132)    1.667 (1.000)
             4    0.667 (∞)    0.786 (∞)    1.063 (∞)    1.285 (1.142)    1.500 (1.000)
             6    0.750 (∞)    0.829 (∞)    1.005 (∞)    1.159 (1.159)    1.333 (1.000)
            10    0.833 (∞)    0.877 (∞)    0.973 (∞)    1.067 (1.186)    1.200 (1.000)

This can be explained by the fact that, roughly speaking, "$\hat V^{(n)}_T$
is optimal at $t_0$." In more rigorous terms, we have that, for any fixed n,

(3.11) $\underset{\sim}{V}^{(n)}_{\nu\#} - \hat V^{(n)}_T = o(1)$, $P^{(n)}$-a.s., as $\nu \to 0$.

Indeed, the scores $K_\nu$ associated with the k-dimensional Student $t_\nu$
are $K_\nu(u) = k(k + \nu) G^{-1}_{k,\nu}(u) / (\nu + k G^{-1}_{k,\nu}(u))$,
$u \in (0,1)$, where $G_{k,\nu}$ stands for the Fisher-Snedecor distribution
function with k and $\nu$ degrees of freedom. It is easily checked that
$G^{-1}_{k,\nu}(u) / \nu \to \infty$ as $\nu \to 0$ so that
$\lim_{\nu \to 0} K_\nu(u) = k$ for all $u \in (0,1)$. It follows (with
obvious notation) that $\hat W^{(n)}_\nu - \hat V^{(n)}_T = o(1)$,
$P^{(n)}$-a.s., as $\nu \to 0$. This, in view of (3.9), implies (3.11).
Similarly, it can be shown that (using obvious notation) for all fixed n and
$\nu$, $\underset{\sim}{V}^{(n)}_{\nu\#}(x_k) - \hat V^{(n)}_T(x_k)$ is
$o(1)$ as $k \to \infty$ along any sequence $(x_k, k = 2, 3, \ldots)$, where
$x_k = (x_{k1}, \ldots, x_{kn})$ is an n-tuple of vectors in $\mathbb{R}^k$;
here, for $k > n$, $\hat V^{(n)}_T(x_k)$ can be taken as any solution of
Tyler's M-equation. This explains the fact that for all fixed $\nu$, the ARE
of $\underset{\sim}{V}^{(n)}_{\nu\#}$ with respect to $\hat V^{(n)}_T$ goes
to 1 as $k \to \infty$. Incidentally, this also holds for the van der
Waerden version of our estimators: as the dimension k of the observation
space goes to infinity, the information contained in the radii $d_i$ becomes
negligible when compared with that contained in the directions $U_i$.
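
The limiting behaviour of the Student scores can be checked directly in dimension k = 2. The sketch below relies on an assumption of ours, namely the closed form of the F(2, nu) quantile, under which the score reduces to K_nu(u) = (2 + nu)(1 - (1 - u)^{2/nu}); it is written with expm1 and log1p to stay accurate for very large nu.

```python
import math

def student_score_k2(u, nu):
    """K_nu(u) for k = 2: (2 + nu) * (1 - (1 - u)**(2 / nu))."""
    return -(2.0 + nu) * math.expm1((2.0 / nu) * math.log1p(-u))

# Hypothetical usage: very heavy tails (nu near 0) and near-Gaussian tails.
heavy_tail_score = student_score_k2(0.5, 1e-3)
light_tail_score = student_score_k2(0.5, 1e8)
```

For nu close to zero the score is nearly constant and equal to k = 2, which corresponds to Tyler's sign-based estimator, while for very large nu it approaches the van der Waerden score -2 log(1 - u).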

4. Estimation of cross-information coefficients. Our estimators
$\underset{\sim}{V}^{(n)}_{f_1\#}$, thus far, have only been defined up to the
choice of a consistent estimator $\alpha^*$ of the unknown cross-information
quantity $\mathcal{J}_k(f_1, g_1)$ defined in (2.7). In this section, we
first review the various methods available in the literature for estimating
$\mathcal{J}_k(f_1, g_1)$ and then present an original method which relies
on a local maximum likelihood argument.

4.1. A brief review of the literature. The problem of estimating the
cross-information coefficient $\mathcal{J}_k(f_1, g_1)$ has always been
around in R-estimation and probably explains why it has never been as
popular as rank tests in applications. Simple consistent estimators of
cross-information coefficients (the definition of which depends on the
problem under study) have been proposed by Lehmann [32] and Sen [42] for
one- and two-sample location problems; these estimators are based on
comparisons of confidence interval lengths, a method involving the arbitrary
choice of a confidence level $(1 - \alpha)$ which has quite an impact on the
final result.

Another simple method can be obtained from the asymptotic linearity
property of rank statistics (see [2, 29] or [25], page 321 for univariate location
and regression). This method extends quite easily to the present context via
the asymptotic linearity property (2.9). The latter indeed implies that for
all $f_1, g_1 \in \mathcal{F}_A$ and any $k \times k$ symmetric matrix $v$ such that $v_{11} = 0$,

$$\utilde{\Delta}_{f_1}^{(n)}(V_\#^{(n)} + n^{-1/2}v) - \utilde{\Delta}_{f_1}^{(n)}(V_\#^{(n)})
  = \utilde{\Delta}_{f_1}^{(n)}(V + n^{-1/2}v) - \utilde{\Delta}_{f_1}^{(n)}(V) + o_{\mathrm{P}}(1)
  = -\mathcal{J}_k(f_1,g_1)\,\Upsilon_k^{-1}(V)\,\overset{\circ}{\mathrm{vech}}(v) + o_{\mathrm{P}}(1),$$

under $\mathrm{P}^{(n)}_{\sigma^2,V;g_1}$, as $n \to \infty$. Thus, for any $v$,

$$\alpha^{(n)} := \frac{\bigl\|\utilde{\Delta}_{f_1}^{(n)}(V_\#^{(n)} + n^{-1/2}v) - \utilde{\Delta}_{f_1}^{(n)}(V_\#^{(n)})\bigr\|}{\bigl\|\Upsilon_k^{-1}(V_\#^{(n)})\,\overset{\circ}{\mathrm{vech}}(v)\bigr\|} \tag{4.1}$$

is a consistent estimate, under $\mathrm{P}^{(n)}_{\sigma^2,V;g_1}$, of $\mathcal{J}_k(f_1,g_1)$. This method, however,
is likely to suffer the same weaknesses as the univariate traditional idea; in
particular, these "naive" estimators involve the arbitrary choice of a "small"
perturbation of the parameter [the choice of a particular $v$ in (4.1) is indeed
as good/bad as that of $2v$, $3v$, etc.]. Theory again provides no guidelines
for this choice which, unfortunately, has a dramatic impact on the output.

More elaborate approaches involve a kernel estimate of $g_1$ and, hence,
cannot be expected to perform well under small and moderate sample sizes.
Such kernel methods have been considered, for Wilcoxon scores, by [41]
(see also [3, 4, 7]) and, in a more general setting, in Section 4.5 of [27].
They also require arbitrary choices (window width and kernel or, as in
[27], the choice of the order $\alpha$ of an empirical quantile) for which a universal
recommendation seems hardly possible (see [28] for an empirical
investigation). Moreover, estimating the actual underlying density is somewhat
incompatible with the group-invariance spirit of the rank-based approach:
if, indeed, the unknown density $g_1$ is eventually to be estimated by some
$\hat{g}_1$, then why not simply adopt a more traditional estimated-score approach
based on the asymptotic reconstruction, via $\Delta^{(n)}_{\hat{g}_1}$, of the efficient central
sequence $\Delta^{*(n)}_{g_1}$?

4.2. An original (local likelihood) method: consistency and efficiency. A
more sophisticated way of dealing with the estimation of $\mathcal{J}_k(f_1,g_1)$ can be
obtained by further exploiting the ULAN structure of the model. The basic
intuition is that of solving a local likelihood equation. Consistency, however,
requires somewhat confusing discretization steps which, as usual, are needed
in formal proofs only. Therefore, we provide two descriptions of the method:
this section carefully covers the details of discretization and establishes the
consistency of the proposed estimator (hence, that of the resulting $\utilde{V}_{f_1\#}^{(n)}$),
while Section 4.3 below, where discretization is skipped, can be used for
practical implementation.

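Consider the sequence of (random) half-lines,

$$\mathcal{D}^{(n)} = \mathcal{D}^{(n)}\bigl(V_\#^{(n)};\, \utilde{\Delta}_{f_1}^{(n)}(V_\#^{(n)})\bigr)
  := \bigl\{\overset{\circ}{\mathrm{vech}}(\utilde{V}_{f_1\#}^{(n)}(\beta)) \bigm| \beta \in \mathbb{R}^+\bigr\}, \qquad n \in \mathbb{N},$$

with equation

$$\overset{\circ}{\mathrm{vech}}(\utilde{V}_{f_1\#}^{(n)}(\beta))
  := \overset{\circ}{\mathrm{vech}}(V_\#^{(n)}) + n^{-1/2}\beta\,\Upsilon_k(V_\#^{(n)})\,\utilde{\Delta}_{f_1}^{(n)}(V_\#^{(n)})
  = \overset{\circ}{\mathrm{vech}}(V_\#^{(n)}) + \beta\,k(k+2)\,N_k\bigl[I_{k^2} - (\mathrm{vec}\,V_\#^{(n)})e_{k^2,1}'\bigr]\mathrm{vec}(\utilde{W}_{f_1\#}^{(n)}), \tag{4.2}$$

where $e_{k^2,1}$ stands for the first vector of the canonical basis in $\mathbb{R}^{k^2}$ and
$\utilde{W}_{f_1\#}^{(n)} := \utilde{W}_{f_1}^{(n)}(V_\#^{(n)})$; the last equality is obtained exactly as in the proof of
Proposition 3.1(iii). Each value of $\beta$ defines on $\mathcal{D}^{(n)}$ a sequence of root-n
consistent estimators $\utilde{V}_{f_1\#}^{(n)}(\beta)$ of $V$; one of them, namely
$\utilde{V}_{f_1\#}^{(n)}(\mathcal{J}_k^{-1}(f_1,g_1))$, coincides with the pseudo-estimator defined in (3.1) and is
efficient at $f_1$ [actually, an estimator $V^{(n)}$ is efficient iff
$V^{(n)} - \utilde{V}_{f_1\#}^{(n)}(\mathcal{J}_k^{-1}(f_1,g_1)) = o_{\mathrm{P}}(n^{-1/2})$ under $\mathrm{P}^{(n)}_{\sigma^2,V;g_1}$].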

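However, these estimators $\utilde{V}_{f_1\#}^{(n)}(\beta)$ are not locally discrete since the
multivariate signs $U_i^{(n)}$ in $\utilde{W}_{f_1\#}^{(n)}$ are not discretized (even though evaluated
at $V_\#^{(n)}$); therefore, we discretize them further by discretizing $\utilde{W}_{f_1\#}^{(n)}$: let
$\utilde{W}_{f_1\#\#}^{(n)}$ be the $k \times k$ matrix obtained by mapping each entry $w_{ij}$ of $\utilde{W}_{f_1\#}^{(n)}$
onto $w_{ij\#} := c_1^{-1}\,\mathrm{sign}(w_{ij})\,n^{-1/2}\lceil n^{1/2} c_1 |w_{ij}| \rceil$, where $c_1 > 0$ is some
arbitrarily large constant. Replacing (4.2) (but keeping the same notation for
the sake of simplicity) with

$$\overset{\circ}{\mathrm{vech}}(\utilde{V}_{f_1\#}^{(n)}(\beta_\ell))
  := \overset{\circ}{\mathrm{vech}}(V_\#^{(n)}) + \beta_\ell\,k(k+2)\,N_k\bigl[I_{k^2} - (\mathrm{vec}\,V_\#^{(n)})e_{k^2,1}'\bigr]\mathrm{vec}(\utilde{W}_{f_1\#\#}^{(n)})
  =: \overset{\circ}{\mathrm{vech}}(V_\#^{(n)}) + n^{-1/2}\beta_\ell\,\Upsilon_k(V_\#^{(n)})\,\utilde{\Delta}_{f_1\#}^{(n)}(V_\#^{(n)}), \qquad \ell \in \mathbb{N}, \tag{4.3}$$

where $\beta_\ell := \ell/c_2$, with some other arbitrary constant $c_2 > 0$, yields root-n
consistent estimators $\utilde{V}_{f_1\#}^{(n)}(\beta_\ell)$ that are locally discrete, in the sense that
the number of possible values of $\overset{\circ}{\mathrm{vech}}(\utilde{V}_{f_1\#}^{(n)}(\beta_\ell))$ in balls with $O(n^{-1/2})$
radius centered at $\overset{\circ}{\mathrm{vech}}(V)$ is bounded as $n \to \infty$. Still for simplicity, we keep
the notation $\mathcal{D}^{(n)}$ for this new sequence $\mathcal{D}^{(n)}(V_\#^{(n)}; \utilde{\Delta}_{f_1\#}^{(n)}(V_\#^{(n)}))$ of fully-
discretized half-lines. For any $\ell \in \mathbb{N}$, $\utilde{V}_{f_1\#}^{(n)}(\beta_\ell)$ can again serve as the
preliminary estimator in a rank-based one-step procedure: letting

$$\overset{\circ}{\mathrm{vech}}(\utilde{V}_{f_1\#}^{(n)}(\beta_\ell;\delta))
  := \overset{\circ}{\mathrm{vech}}(\utilde{V}_{f_1\#}^{(n)}(\beta_\ell)) + n^{-1/2}\delta\,\Upsilon_k(\utilde{V}_{f_1\#}^{(n)}(\beta_\ell))\,\utilde{\Delta}_{f_1\#}^{(n)}(\utilde{V}_{f_1\#}^{(n)}(\beta_\ell)),$$

$\overset{\circ}{\mathrm{vech}}(\utilde{V}_{f_1\#}^{(n)}(\beta_\ell; \mathcal{J}_k^{-1}(f_1,g_1)))$ is such that

$$\overset{\circ}{\mathrm{vech}}(\utilde{V}_{f_1\#}^{(n)}(\beta_\ell; \mathcal{J}_k^{-1}(f_1,g_1))) - \overset{\circ}{\mathrm{vech}}(\utilde{V}_{f_1\#}^{(n)}(\mathcal{J}_k^{-1}(f_1,g_1))) = o_{\mathrm{P}}(n^{-1/2}) \tag{4.4}$$

under $\mathrm{P}^{(n)}_{\sigma^2,V;g_1}$. However, $\overset{\circ}{\mathrm{vech}}(\utilde{V}_{f_1\#}^{(n)}(\beta_\ell; \mathcal{J}_k^{-1}(f_1,g_1)))$ still cannot be
computed from the observations.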
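The entrywise discretization $w \mapsto c^{-1}\,\mathrm{sign}(w)\,n^{-1/2}\lceil n^{1/2} c |w| \rceil$ used above is easy to illustrate. The sketch below is not the authors' code; the function names `discretize` and `discretize_matrix` are hypothetical, and matrices are assumed to be plain nested lists.

```python
import math


def discretize(w: float, n: int, c: float) -> float:
    """Map w onto c^{-1} * sign(w) * n^{-1/2} * ceil(n^{1/2} * c * |w|).

    The result lies on a grid of mesh 1 / (c * sqrt(n)) and is at most one
    grid step away from w, so a larger constant c gives a finer grid.
    """
    if w == 0.0:
        return 0.0  # sign(0) = 0
    root_n = math.sqrt(n)
    sign = 1.0 if w > 0 else -1.0
    return sign * math.ceil(root_n * c * abs(w)) / (c * root_n)


def discretize_matrix(W, n, c):
    """Apply the discretization to every entry of a matrix (nested lists)."""
    return [[discretize(w, n, c) for w in row] for row in W]
```

For instance, with n = 100 the grid mesh is 1/(10c), so increasing the constant c by a factor of ten refines the grid by the same factor.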
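Denote by $u_{\mathcal{D}}$ the unit vector along $\mathcal{D}^{(n)}$ (corresponding to $\mathcal{D}^{(n)}$'s natural
orientation as a half-line) and define

$$\ell^+ := \min\bigl\{\ell \in \mathbb{N} \bigm| h_\#(\beta_\ell) := u_{\mathcal{D}}'\,\Upsilon_k(\utilde{V}_{f_1\#}^{(n)}(\beta_\ell))\,\utilde{\Delta}_{f_1\#}^{(n)}(\utilde{V}_{f_1\#}^{(n)}(\beta_\ell)) < 0\bigr\}, \tag{4.5}$$

$\ell^- := \ell^+ - 1$ and $\beta^\pm := \beta_{\ell^\pm}$. The integers $\ell^\pm$ are random; in order for
$\utilde{V}_{f_1\#}^{(n)}(\beta^\pm)$ to remain root-n consistent and locally discrete, it is sufficient
to check that $\ell^\pm$ is $O_{\mathrm{P}}(1)$. This implies that for any $\varepsilon > 0$, there exist
integers $L_\varepsilon$ and $N_\varepsilon$ such that for all $n > N_\varepsilon$, the minimization in (4.5) with
probability larger than $1 - \varepsilon$ only runs over the finite set $\ell \in \{1, \ldots, L_\varepsilon\}$
(equivalently, over the finite set $\beta \in \{\beta_1, \ldots, \beta_{L_\varepsilon}\}$). In order to show this,
let us assume that $\ell^\pm$ is not $O_{\mathrm{P}}(1)$. Then there exist $\varepsilon > 0$ and a sequence
$n_i \uparrow \infty$ such that for all $L \in \mathbb{N}$ and some $\sigma^2$, $V$ and $g_1$, $\mathrm{P}^{(n_i)}_{\sigma^2,V;g_1}[\ell^- > L] > \varepsilon$.
Pythagoras' Theorem then implies that for $L > c_2\,\mathcal{J}_k^{-1}(f_1,g_1)$, with $\mathrm{P}^{(n_i)}_{\sigma^2,V;g_1}$-
probability larger than $\varepsilon$,

$$\bigl\|\overset{\circ}{\mathrm{vech}}(\utilde{V}_{f_1\#}^{(n)}(\beta_L; \mathcal{J}_k^{-1}(f_1,g_1))) - \overset{\circ}{\mathrm{vech}}(\utilde{V}_{f_1\#}^{(n)}(\mathcal{J}_k^{-1}(f_1,g_1)))\bigr\|
  \geq \bigl\|\overset{\circ}{\mathrm{vech}}(\utilde{V}_{f_1\#}^{(n)}(\beta_L)) - \overset{\circ}{\mathrm{vech}}(\utilde{V}_{f_1\#}^{(n)}(\mathcal{J}_k^{-1}(f_1,g_1)))\bigr\|
  = n^{-1/2}\bigl(c_2^{-1}L - \mathcal{J}_k^{-1}(f_1,g_1)\bigr)\bigl\|\Upsilon_k(V_\#^{(n)})\,\utilde{\Delta}_{f_1\#}^{(n)}(V_\#^{(n)})\bigr\|,$$

which contradicts the fact that (4.4) holds for $\ell = L$. Thus, $\ell^\pm$ are $O_{\mathrm{P}}(1)$
and $\utilde{V}_{f_1\#}^{(n)}(\beta^\pm)$ can also serve as initial estimators in a one-step strategy.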

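The final step in the construction of our estimator $\utilde{V}_{f_1\#}^{(n)}$, then, is a
"fine tuning" step which consists of selecting an intermediate point between
$\beta^-$ and $\beta^+$. This intermediate value, as we shall see, turns out to
consistently estimate $\mathcal{J}_k^{-1}(f_1,g_1)$. Denote by $\pi^{\pm}(\delta)$ the projection onto $\mathcal{D}^{(n)}$
of $\overset{\circ}{\mathrm{vech}}(\utilde{V}_{f_1\#}^{(n)}(\beta^{\pm};\delta))$ and let $\lambda^{\pm}(\delta) := \|\pi^{\pm}(\delta) - \overset{\circ}{\mathrm{vech}}(V_\#^{(n)})\|$. Note that
$\delta \mapsto \pi^{-}(\delta)$ [resp., $\delta \mapsto \pi^{+}(\delta)$] is $\mathrm{P}^{(n)}$-a.e. continuous and strictly
monotone increasing (resp., decreasing). Therefore, there exists a unique $\delta^*$ such
that $\pi^{-}(\delta^*) = \pi^{+}(\delta^*)$. The proposed R-estimator of $V$ is the shape matrix
$\utilde{V}_{f_1\#}^{(n)}$ characterized by $\overset{\circ}{\mathrm{vech}}(\utilde{V}_{f_1\#}^{(n)}) := \pi^{\pm}(\delta^*)$.

Let us show, to conclude, that $\pi^{\pm}(\delta^*) - \overset{\circ}{\mathrm{vech}}(\utilde{V}_{f_1\#}^{(n)}(\mathcal{J}_k^{-1}(f_1,g_1))) = o_{\mathrm{P}}(n^{-1/2})$ under
$\mathrm{P}^{(n)}_{\sigma^2,V;g_1}$. Either we have $\lambda^{-}(\mathcal{J}_k^{-1}(f_1,g_1)) \leq \lambda^{+}(\mathcal{J}_k^{-1}(f_1,g_1))$ and

$$\lambda^{-}(\mathcal{J}_k^{-1}(f_1,g_1)) \leq \lambda^{\pm}(\delta^*) \leq \lambda^{+}(\mathcal{J}_k^{-1}(f_1,g_1)),$$

or $\lambda^{-}(\mathcal{J}_k^{-1}(f_1,g_1)) > \lambda^{+}(\mathcal{J}_k^{-1}(f_1,g_1))$ and

$$\lambda^{+}(\mathcal{J}_k^{-1}(f_1,g_1)) < \lambda^{\pm}(\delta^*) < \lambda^{-}(\mathcal{J}_k^{-1}(f_1,g_1)).$$

In both cases, $\pi^{\pm}(\delta^*)$ is in the interval $[\pi^{-}(\mathcal{J}_k^{-1}(f_1,g_1)), \pi^{+}(\mathcal{J}_k^{-1}(f_1,g_1))]$.
Now, both $\pi^{-}(\mathcal{J}_k^{-1}(f_1,g_1))$ and $\pi^{+}(\mathcal{J}_k^{-1}(f_1,g_1))$ are efficient estimators
satisfying (3.2) and (3.3). Indeed, from Pythagoras' Theorem,

$$\bigl\|\pi^{\pm}(\mathcal{J}_k^{-1}(f_1,g_1)) - \overset{\circ}{\mathrm{vech}}(\utilde{V}_{f_1\#}^{(n)}(\mathcal{J}_k^{-1}(f_1,g_1)))\bigr\|
  \leq \bigl\|\overset{\circ}{\mathrm{vech}}(\utilde{V}_{f_1\#}^{(n)}(\beta^{\pm}; \mathcal{J}_k^{-1}(f_1,g_1))) - \overset{\circ}{\mathrm{vech}}(\utilde{V}_{f_1\#}^{(n)}(\mathcal{J}_k^{-1}(f_1,g_1)))\bigr\| = o_{\mathrm{P}}(n^{-1/2})$$

under $\mathrm{P}^{(n)}_{\sigma^2,V;g_1}$. Therefore, as a convex linear combination of $\pi^{-}(\mathcal{J}_k^{-1}(f_1,g_1))$
and $\pi^{+}(\mathcal{J}_k^{-1}(f_1,g_1))$, $\overset{\circ}{\mathrm{vech}}(\utilde{V}_{f_1\#}^{(n)}) = \pi^{\pm}(\delta^*)$ is also an efficient estimator
satisfying (3.2) and (3.3) and, contrary to $\pi^{\pm}(\mathcal{J}_k^{-1}(f_1,g_1))$, it is computable
from the sample. Now, clearly,

$$\alpha^*_\# := (\beta^*_\#)^{-1} := \Bigl[n^{1/2}\,\bigl\|\pi^{\pm}(\delta^*) - \overset{\circ}{\mathrm{vech}}(V_\#^{(n)})\bigr\| \Big/ \bigl\|\Upsilon_k(V_\#^{(n)})\,\utilde{\Delta}_{f_1\#}^{(n)}(V_\#^{(n)})\bigr\|\Bigr]^{-1} \tag{4.6}$$

and $(\mathcal{J}_k(f_1)/(\alpha^*_\#)^2)\,\Upsilon_k(\utilde{V}_{f_1\#}^{(n)})$ yield consistent (under $\mathrm{P}^{(n)}_{\sigma^2,V;g_1}$) estimators of
$\mathcal{J}_k(f_1,g_1)$ and consistent (under $\mathrm{P}^{(n)}_{\sigma^2,V;g_1}$) estimators of the asymptotic covariance
matrix of $\overset{\circ}{\mathrm{vech}}(\utilde{V}_{f_1\#}^{(n)})$, respectively.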



4.3. An original (local likelihood) method: practical implementation. As
usual, the discretization technique which complicates the proofs of
asymptotic results and obscures the definition of the estimator makes little sense
in practice, where $n$ is fixed. Discretization in the previous sections was
achieved in three steps: discretization of Tyler's $V_T^{(n)}$ into $V_\#^{(n)}$ (based on $c_0$),
discretization of $\utilde{\Delta}_{f_1}^{(n)}(V_\#^{(n)})$ into $\utilde{\Delta}_{f_1\#}^{(n)}(V_\#^{(n)})$ (based on $c_1$) and
discretization of $\beta$ into $\beta_\ell$ (based on $c_2$). The "undiscretized version" $\utilde{V}_{f_1}^{(n)}$ of $\utilde{V}_{f_1\#}^{(n)}$
corresponds to arbitrarily large values of these three discretization constants,
leaving $V_T^{(n)}$ and $\utilde{\Delta}_{f_1}^{(n)}$ unchanged and bringing (for the sample size at hand)
$\beta^+$ and $\beta^-$ so close to each other that the final tuning [involving the
solution $\delta^*$ of $\pi^{-}(\delta) = \pi^{+}(\delta)$] becomes numerically meaningless.
Alternatively, denoting by $\utilde{V}_{f_1\#}^{(n)}(c)$ the estimator associated with the discretization
constants $c := (c_0, c_1, c_2)$, we have $\utilde{V}_{f_1}^{(n)} := \lim_{c \to \infty} \utilde{V}_{f_1\#}^{(n)}(c)$, where $c \to \infty$
means that $c_i \to \infty$ for $i = 0, 1, 2$.

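This practical implementation $\utilde{V}_{f_1}^{(n)}$ of $\utilde{V}_{f_1\#}^{(n)}$ can be obtained more
directly as follows. Letting

$$\overset{\circ}{\mathrm{vech}}(\utilde{V}_{f_1}^{(n)}(\beta)) := \overset{\circ}{\mathrm{vech}}(V_T^{(n)}) + n^{-1/2}\beta\,\Upsilon_k(V_T^{(n)})\,\utilde{\Delta}_{f_1}^{(n)}(V_T^{(n)}), \qquad \beta \in \mathbb{R}^+,$$

[the undiscretized version of $\overset{\circ}{\mathrm{vech}}(\utilde{V}_{f_1\#}^{(n)}(\beta_\ell))$], consider the $\mathrm{P}^{(n)}$-a.e.
piecewise continuous function

$$\beta \mapsto h(\beta) := \bigl(\utilde{\Delta}_{f_1}^{(n)}(V_T^{(n)})\bigr)'\,\Upsilon_k(V_T^{(n)})\,\Upsilon_k(\utilde{V}_{f_1}^{(n)}(\beta))\,\utilde{\Delta}_{f_1}^{(n)}(\utilde{V}_{f_1}^{(n)}(\beta)), \tag{4.7}$$

and put $\beta^* := \inf\{\beta > 0 \mid h(\beta) < 0\}$, $\beta^{*-} := \beta^* - 0$ and $\beta^{*+} := \beta^* + 0$. The
matrices $\utilde{V}_{f_1}^{(n)}(\beta^{*-})$ and $\utilde{V}_{f_1}^{(n)}(\beta^{*+})$ are clearly the "undiscretized counterparts"
of $\utilde{V}_{f_1\#}^{(n)}(\beta^-)$ and $\utilde{V}_{f_1\#}^{(n)}(\beta^+)$, respectively. However, $\beta \mapsto \utilde{V}_{f_1}^{(n)}(\beta)$ being
continuous, $\utilde{V}_{f_1}^{(n)}(\beta^{*-}) = \utilde{V}_{f_1}^{(n)}(\beta^{*+})$. The estimator proposed in Section 4.2 lies
between $\utilde{V}_{f_1\#}^{(n)}(\beta^-)$ and $\utilde{V}_{f_1\#}^{(n)}(\beta^+)$. Accordingly, the R-estimator we are
proposing in practice is $\utilde{V}_{f_1}^{(n)} := \utilde{V}_{f_1}^{(n)}(\beta^*) = \utilde{V}_{f_1}^{(n)}(\beta^{*\pm})$; $\alpha^* := (\beta^*)^{-1}$
provides the corresponding estimator of $\mathcal{J}_k(f_1,g_1)$, the "undiscretized" version
of (4.6).

Let us stress, however, that all asymptotic properties, including asymptotic
optimality, are properties of the discretized estimators $\utilde{V}_{f_1\#}^{(n)}$, whereas
nothing can be said about the asymptotics of the practical implementation
$\utilde{V}_{f_1}^{(n)}$.
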
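The search for $\beta^* = \inf\{\beta > 0 \mid h(\beta) < 0\}$ can be sketched numerically as a coarse grid scan followed by bisection. This is only an illustration under stated assumptions: the function name `first_negative_beta`, the grid parameters and the tolerance are hypothetical, and `h` is taken to be any user-supplied callable standing in for (4.7). Since $h$ is only piecewise continuous, the bisection step locates a sign change inside the first grid cell where a negative value is found, which approximates the infimum only if the grid is fine enough.

```python
from typing import Callable


def first_negative_beta(h: Callable[[float], float], beta_max: float = 10.0,
                        grid: int = 1000, tol: float = 1e-10) -> float:
    """Approximate beta* = inf{beta > 0 : h(beta) < 0}.

    Scans beta = step, 2*step, ..., beta_max; at the first grid point where
    h is negative, refines by bisection between that point and the previous
    grid point (where h was found non-negative, or 0 for the first cell).
    """
    step = beta_max / grid
    lo = 0.0
    for j in range(1, grid + 1):
        hi = j * step
        if h(hi) < 0:
            while hi - lo > tol:
                mid = 0.5 * (lo + hi)
                if h(mid) < 0:
                    hi = mid
                else:
                    lo = mid
            return hi
        lo = hi
    raise ValueError("h stays non-negative on (0, beta_max]")
```

With the toy function `h = lambda b: 1.0 - 2.0 * b`, the routine returns a value within the tolerance of 0.5, and its reciprocal plays the role of the estimate $\alpha^* = (\beta^*)^{-1}$.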

5. Asymptotic affine-equivariance. An estimator $V^{(n)}$ of the shape
matrix $V$ is said to be (strictly, that is, for any fixed $n$) affine-equivariant iff
for any invertible $k \times k$ matrix $M$ and any $k$-vector $a$,

$$V^{(n)}(M,a) = (M V^{(n)} M')/(M V^{(n)} M')_{11}, \tag{5.1}$$

where $V^{(n)}(M,a)$ denotes the value of the statistic $V^{(n)}$ computed from
the transformed sample $MX_1 + a, \ldots, MX_n + a$. Both Tyler's $V_T^{(n)}$ and
the Gaussian estimator $V_{\mathcal{N}}^{(n)}$ are affine-equivariant. Unfortunately, the final
estimators $\utilde{V}_{f_1}^{(n)}$ proposed in Section 4.3 are not.
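Property (5.1) can be checked numerically for the Gaussian estimator in the bivariate case. The sketch below is an illustration only: it assumes that the Gaussian shape estimator is the sample covariance matrix rescaled so that its upper-left entry equals one, and the helper names (`gaussian_shape`, `transformed_shape`) as well as the particular $M$ and $a$ are hypothetical.

```python
import random


def mat_mul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose(A):
    return [[A[j][i] for j in range(2)] for i in range(2)]


def gaussian_shape(points):
    """Sample covariance matrix rescaled so that its (1,1) entry equals 1."""
    n = len(points)
    mean = [sum(p[j] for p in points) / n for j in range(2)]
    S = [[sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in points) / n
          for j in range(2)] for i in range(2)]
    return [[S[i][j] / S[0][0] for j in range(2)] for i in range(2)]


def transformed_shape(V, M):
    """Right-hand side of (5.1): (M V M') / (M V M')_{11}."""
    B = mat_mul(mat_mul(M, V), transpose(M))
    return [[B[i][j] / B[0][0] for j in range(2)] for i in range(2)]


rng = random.Random(0)
X = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
M = [[2.0, 1.0], [0.5, 3.0]]   # invertible 2x2 matrix (arbitrary choice)
a = (1.0, -2.0)                # translation vector (arbitrary choice)
MX = [(M[0][0] * x + M[0][1] * y + a[0], M[1][0] * x + M[1][1] * y + a[1])
      for x, y in X]

lhs = gaussian_shape(MX)                       # estimator on the transformed sample
rhs = transformed_shape(gaussian_shape(X), M)  # transformed estimator
max_gap = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
```

Here `max_gap` should be zero up to floating-point rounding, which is the strict (fixed $n$) equivariance of (5.1) for this estimator.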
The question arises as to whether $\utilde{V}_{f_1}^{(n)}$ is at least asymptotically
affine-equivariant, that is, whether $\utilde{V}_{f_1}^{(n)}$ is asymptotically equivalent to some
strictly affine-equivariant sequence (not necessarily a sequence of
estimators): for all practical purposes, a sequence of pseudo-estimators, or simply
a sequence of random shape matrices, would be fine. Closer inspection of this
idea, however, reveals a major conceptual problem. Indeed, recall that all
asymptotic results belong to the discretized estimators $\utilde{V}_{f_1\#}^{(n)}$, while nothing
can be said about the asymptotics of $\utilde{V}_{f_1}^{(n)}$: a definition of asymptotic
equivariance relying on the asymptotic behavior of $\utilde{V}_{f_1}^{(n)}$ is thus totally ineffective.
Therefore, we propose the following, slightly weaker, definition. Denote by

$$\mathcal{S}^{(n)} := \{S_m^{(n)}(X^{(n)}) \mid m \in \mathbb{N}\} \quad\text{and}\quad \mathcal{T}^{(n)} := \{T_m^{(n)}(X^{(n)}) \mid m \in \mathbb{N}\}, \qquad n \in \mathbb{N},$$

two countable sequences of $X^{(n)}$-measurable random vectors or matrices
such that the a.s. limits $S^{(n)} := \lim_{m\to\infty} S_m^{(n)}(X^{(n)})$ and $T^{(n)} := \lim_{m\to\infty} T_m^{(n)}(X^{(n)})$
exist for all fixed $n$. Then if

(i) $\mathcal{S}^{(n)}$ and $\mathcal{T}^{(n)}$ are asymptotically equivalent, meaning that for all $m$
(or, more generally, for $m$ large enough), $S_m^{(n)}(X^{(n)}) - T_m^{(n)}(X^{(n)}) = o_{\mathrm{P}}(n^{-1/2})$
as $n \to \infty$, and if

(ii) $S^{(n)}$ is strictly equivariant,

we may consider that $T^{(n)}$ inherits, under approximate or asymptotic form,
the equivariance property of $S^{(n)}$; we say that $T^{(n)}$ is weakly asymptotically
equivariant.

In order to show that the proposed estimators $\utilde{V}_{f_1}^{(n)} := \lim_{c\to\infty} \utilde{V}_{f_1\#}^{(n)}(c)$
are weakly asymptotically affine-equivariant, consider the class
$\mathcal{T}^{(n)} := \{\utilde{V}_{f_1\#}^{(n)}(c_m) \mid m \in \mathbb{N}\}$, where the sequence $c_m = (c_{m,0}, c_{m,1}, c_{m,2})$ is such that
$\lim_{m\to\infty} c_{m,i} = \infty$, $i = 0, 1, 2$, and let us construct a class $\mathcal{S}^{(n)}$ such that
conditions (i) and (ii) for weak asymptotic equivariance are satisfied. Incidentally,
note that a choice of the form $\mathcal{S}^{(n)} := \{\utilde{V}_{f_1\#}^{(n)}(c_{0,m}) \mid m \in \mathbb{N}\}$ (with $c_{0,m} \to \infty$),
where $\utilde{V}_{f_1\#}^{(n)}(c_0)$ denotes the pseudo-estimator defined in (3.1), is not suitable
since the corresponding practical implementation $\lim_{m\to\infty} \utilde{V}_{f_1\#}^{(n)}(c_{0,m})$
is not strictly affine-equivariant.

Inspired by $\utilde{V}_{f_1\#}^{(n)}$'s representation (3.9) as a linear combination of $V_\#^{(n)}$
and the rank-based shape matrix $\utilde{W}_{f_1\#}^{(n)}$ defined in (3.10), consider now the
shape pseudo-estimators

$$\bar{V}_{f_1\#}^{(n)} = \bar{V}_{f_1\#}^{(n)}(c_0) := B_{f_1\#}^{(n)}/(B_{f_1\#}^{(n)})_{11} \tag{5.2}$$

with

$$B_{f_1\#}^{(n)} := \Bigl(1 - \frac{k(k+2)}{\mathcal{J}_k(f_1,g_1)}\Bigr) V_\#^{(n)} + \frac{k(k+2)}{\mathcal{J}_k(f_1,g_1)}\,\utilde{W}_{f_1\#}^{(n)},$$

where $c_0$ is the constant used in the discretization of Tyler's $V_T^{(n)}$. Although, due to
discretization, neither $V_\#^{(n)}$ nor $\bar{V}_{f_1\#}^{(n)}$ are affine-equivariant for fixed $n$, the class
$\mathcal{S}^{(n)} := \{\bar{V}_{f_1\#}^{(n)}(c_{0,m}) \mid m \in \mathbb{N}\}$ allows us to establish the weak asymptotic affine-
equivariance of $\utilde{V}_{f_1}^{(n)}$, as shown in the following proposition.

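Proposition 5.1. Denote by $\bar{V}_{f_1\#}^{(n)} := \bar{V}_{f_1\#}^{(n)}(c_0)$ and by $\utilde{V}_{f_1\#}^{(n)} := \utilde{V}_{f_1\#}^{(n)}(c)$
the pseudo-estimator defined in (5.2) and the estimator defined in
Section 4.2, respectively. Then (i) $\bar{V}_{f_1\#}^{(n)} - \utilde{V}_{f_1\#}^{(n)} = o_{\mathrm{P}}(n^{-1/2})$ under $\mathrm{P}^{(n)}_{\sigma^2,V;g_1}$ as
$n \to \infty$ and (ii) the practical implementation $\bar{V}_{f_1}^{(n)} := \lim_{m\to\infty} \bar{V}_{f_1\#}^{(n)}(c_{0,m})$
is strictly affine-equivariant.

PROOF. See Section A.3. □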

Whether or not weak asymptotic equivariance is a satisfactory property
is a matter of statistical taste. If it is, then this section shows that $\utilde{V}_{f_1}^{(n)}$ is
the estimator to be used. The reader who feels that strict equivariance is an
essential requirement is referred to [15], where it is shown that an adequate
modification of $\utilde{V}_{f_1}^{(n)}$ producing a strictly equivariant estimator is possible (at
the price of some technicalities). Alternatively, it is shown in [8] that, under
mild additional assumptions, an affine-equivariant R-estimator of shape also
can be obtained from iterating the mapping $V \mapsto \utilde{W}_{f_1}^{(n)}(V)/(\utilde{W}_{f_1}^{(n)}(V))_{11}$,
where $\utilde{W}_{f_1}^{(n)}$ is defined in (3.10).

6. Simulations. In this section, we conduct a Monte Carlo study in
order to compare the finite-sample performances of the one-step R-estimators
$\utilde{V}_{f_1}^{(n)}$ proposed in Section 4.3 (as well as those of their analogs using the
Gaussian estimator $V_{\mathcal{N}}^{(n)}$, instead of Tyler's $V_T^{(n)}$, as a preliminary
estimator) to those of $V_T^{(n)}$ and $V_{\mathcal{N}}^{(n)}$ themselves. We restrict our attention to the
bivariate spherical case ($V = I_2$). We generated $M = 1{,}000$ samples of i.i.d.
observations $X_1, \ldots, X_n$, with sizes $n = 50$ and $n = 250$, from the bivariate
standard normal distribution ($\mathcal{N}$), the Student distributions ($t_{0.5}$), ($t_3$) and
($t_{10}$) (with 0.5, 3 and 10 degrees of freedom) and the power-exponential
distributions ($e_3$) and ($e_5$) (with parameters $\eta = 3$ and 5); for details on
power-exponential densities, see Section HP1.2. This choice of Student and
power-exponential distributions allows for the consideration of heavier-than-
normal and lighter-than-normal tail distributions, respectively.

Table 2
Empirical bias and mean-square error, under various bivariate t-, power-exponential and normal
densities, of the preliminary estimators $V_T^{(n)}$ and $V_{\mathcal{N}}^{(n)}$ and the corresponding one-step
R-estimators $\utilde{V}_{0.5}^{(n)}$, $\utilde{V}_{3}^{(n)}$, $\utilde{V}_{10}^{(n)}$ and $\utilde{V}_{vdW}^{(n)}$. The simulation is based on 1000
replications; sample size is n = 50/n = 250. Rows labelled 12 and 22 refer to the two
components, $V_{12}$ and $V_{22} - 1$.

BIAS (n = 50/n = 250)

Estimator Prelim.  Comp.  t_0.5              t_3                t_10               N                  e_3                e_5
V_T       -        12     0.0042/-0.0043     -0.0038/-0.0043    -0.0016/0.0003     0.0006/-0.0030     0.0067/0.0070      -0.0070/-0.0023
V_T       -        22     0.0830/0.0207      0.0973/0.0219      0.0865/0.0062      0.0895/0.0024      0.1118/0.0201      0.0906/0.0072
V_N       -        12     -0.6148/-0.0522    0.0012/-0.0005     -0.0003/-0.0010    -0.0058/0.0005     0.0025/0.0021      -0.0024/-0.0021
V_N       -        22     310.8334/20.6781   0.1782/0.0410      0.0497/0.0058      0.0375/0.0024      0.0484/0.0041      0.0308/0.0006
V_0.5     V_T      12     0.0034/-0.0024     -0.0004/-0.0031    0.0004/-0.0006     -0.0006/-0.0019    0.0039/0.0043      -0.0030/-0.0026
V_0.5     V_T      22     0.0771/0.0183      0.0806/0.0180      0.0619/0.0031      0.0674/0.0030      0.0821/0.0115      0.0664/0.0037
V_0.5     V_N      12     0.0001/0.0021      0.0004/-0.0030     -0.0005/-0.0006    -0.0007/-0.0019    0.0033/0.0043      -0.0036/-0.0026
V_0.5     V_N      22     0.0798/0.0171      0.0782/0.0178      0.0612/0.0032      0.0671/0.0032      0.0820/0.0116      0.0661/0.0037
V_3       V_T      12     0.0002/-0.0014     0.0019/-0.0017     0.0005/-0.0009     -0.0024/-0.0004    0.0023/0.0022      -0.0017/-0.0022
V_3       V_T      22     0.0861/0.0216      0.0680/0.0142      0.0438/0.0024      0.0444/0.0028      0.0533/0.0047      0.0338/0.0006
V_3       V_N      12     0.0014/0.0051      0.0028/-0.0017     0.0002/-0.0009     -0.0021/-0.0004    0.0023/0.0023      -0.0019/-0.0021
V_3       V_N      22     0.1717/0.0219      0.0665/0.0140      0.0433/0.0023      0.0442/0.0030      0.0531/0.0043      0.0336/0.0006
V_10      V_T      12     -0.0001/-0.0008    0.0025/-0.0015     0.0004/-0.0008     -0.0036/0.0001     0.0023/0.0014      -0.0019/-0.0021
V_10      V_T      22     0.0962/0.0250      0.0681/-0.0261     0.0427/0.0029      0.0395/0.0026      0.0441/0.0032      0.0253/0.0000
V_10      V_N      12     0.0037/0.0075      0.0034/-0.0014     0.0001/-0.0008     -0.0031/0.0001     0.0023/0.0016      -0.0019/-0.0020
V_10      V_N      22     0.1074/0.0254      0.0672/0.0128      0.0419/0.0028      0.0398/0.0028      0.0440/0.0032      0.0250/-0.0000
V_vdW     V_T      12     0.0005/-0.0003     0.0027/-0.0014     0.0005/-0.0007     -0.0044/0.0003     0.0024/0.0011      -0.0024/-0.0020
V_vdW     V_T      22     0.1057/0.0281      0.0702/0.0124      0.0441/0.0036      0.0387/0.0025      0.0404/0.0026      0.0217/-0.0000
V_vdW     V_N      12     0.0034/0.0091      0.0035/-0.0013     -0.0001/-0.0007    -0.0041/0.0004     0.0024/0.0013      -0.0022/-0.0019
V_vdW     V_N      22     0.1164/0.0284      0.0696/0.0122      0.0435/0.0036      0.0392/0.0026      0.0402/0.0026      0.0211/-0.0001

MSE (n = 50/n = 250)

Estimator Prelim.  Comp.  t_0.5              t_3                t_10               N                  e_3                e_5
V_T       -        12     0.0410/0.0083      0.0407/0.0081      0.0408/0.0075      0.0404/0.0075      0.0444/0.0080      0.0423/0.0085
V_T       -        22     0.2009/0.0392      0.2467/0.0357      0.2192/0.0337      0.2311/0.0369      0.2163/0.0337      0.2031/0.0320
V_N       -        12     298.8463/11.3416   0.1033/0.0329      0.0265/0.0050      0.0183/0.0038      0.0155/0.0028      0.0138/0.0029
V_N       -        22     80,313,350/42,948  0.7141/0.2358      0.1247/0.0211      0.0941/0.0175      0.0624/0.0115      0.0617/0.0109
V_0.5     V_T      12     0.0368/0.0075      0.0328/0.0065      0.0312/0.0058      0.0307/0.0058      0.0320/0.0057      0.0296/0.0061
V_0.5     V_T      22     0.1862/0.0339      0.1879/0.0285      0.1629/0.0258      0.1701/0.0282      0.1425/0.0233      0.1411/0.0223
V_0.5     V_N      12     0.1152/0.0278      0.0337/0.0065      0.0308/0.0057      0.0309/0.0058      0.0318/0.0057      0.0294/0.0061
V_0.5     V_N      22     0.2700/0.0566      0.1852/0.0284      0.1614/0.0258      0.1686/0.0281      0.1416/0.0233      0.1398/0.0223
V_3       V_T      12     0.0419/0.0090      0.0290/0.0057      0.0238/0.0044      0.0208/0.0042      0.0178/0.0031      0.0149/0.0030
V_3       V_T      22     0.2239/0.0371      0.1546/0.0247      0.1169/0.0199      0.1138/0.0198      0.0715/0.0127      0.0676/0.0112
V_3       V_N      12     0.1184/0.0295      0.0296/0.0058      0.0235/0.0044      0.0209/0.0042      0.0175/0.0031      0.0146/0.0030
V_3       V_N      22     5.6092/0.0598      0.1537/0.0247      0.1162/0.0199      0.1132/0.0197      0.0709/0.0128      0.0668/0.0112
V_10      V_T      12     0.0490/0.0106      0.0300/0.0060      0.0234/0.0043      0.0191/0.0039      0.0147/0.0025      0.0118/0.0022
V_10      V_T      22     0.2701/0.0428      0.1579/1.5539      0.1117/0.0191      0.1005/0.0180      0.0568/0.0102      0.0519/0.0084
V_10      V_N      12     0.1307/0.0339      0.0306/0.0060      0.0232/0.0043      0.0190/0.0039      0.0143/0.0025      0.0114/0.0022
V_10      V_N      22     0.3796/0.0662      0.1583/0.0253      0.1108/0.0191      0.1006/0.0180      0.0562/0.0101      0.0511/0.0083
V_vdW     V_T      12     0.0552/0.0121      0.0316/0.0064      0.0238/0.0044      0.0187/0.0039      0.0135/0.0022      0.0106/0.0019
V_vdW     V_T      22     0.3134/0.0486      0.1652/0.0267      0.1129/0.0192      0.0964/0.0176      0.0518/0.0092      0.0457/0.0073
V_vdW     V_N      12     0.1406/0.0377      0.0322/0.0064      0.0238/0.0044      0.0185/0.0039      0.0131/0.0022      0.0102/0.0018
V_vdW     V_N      22     0.4237/0.0726      0.1665/0.0266      0.1121/0.0192      0.0967/0.0175      0.0511/0.0092      0.0449/0.0072
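As an illustration of the sampling step of such a Monte Carlo design, the sketch below draws spherical bivariate Student and normal samples with the standard library only. It is not the authors' simulation code: the function names are hypothetical, the Student vectors are generated through the usual representation (a standard normal vector divided by the square root of an independent chi-square variable over its degrees of freedom), and power-exponential sampling is left out.

```python
import math
import random


def sample_bivariate_normal(n, rng):
    """n i.i.d. draws from the bivariate standard normal distribution."""
    return [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(n)]


def sample_bivariate_t(nu, n, rng):
    """n i.i.d. draws from the spherical bivariate Student t with nu d.f.

    Uses X = Z / sqrt(chi2_nu / nu), where chi2_nu is generated as a
    Gamma(nu / 2, scale = 2) variable independent of Z.
    """
    sample = []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        chi2 = rng.gammavariate(nu / 2.0, 2.0)
        scale = math.sqrt(chi2 / nu)
        sample.append((z1 / scale, z2 / scale))
    return sample
```

For example, `sample_bivariate_t(0.5, 50, random.Random(1))` produces one heavy-tailed sample of size 50; repeating the call over M replications gives the inputs on which the estimators would be evaluated.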

For each replication, we computed $V_T^{(n)}$, $V_{\mathcal{N}}^{(n)}$ and the $V_T^{(n)}$- and $V_{\mathcal{N}}^{(n)}$-
based one-step R-estimators $\utilde{V}_{vdW}^{(n)}$, $\utilde{V}_{0.5}^{(n)}$, $\utilde{V}_{3}^{(n)}$ and $\utilde{V}_{10}^{(n)}$, corresponding
to semiparametric efficiency at Gaussian and Student densities with 0.5,
3 and 10 degrees of freedom, respectively. In Table 2, we report, for each
estimate $V^{(n)}(l) = (V_{ij}^{(n)}(l))$, the two components of the average bias

$$\mathrm{BIAS}^{(n)} := \frac{1}{M}\sum_{l=1}^{M} \overset{\circ}{\mathrm{vech}}(V^{(n)}(l) - V) = \frac{1}{M}\sum_{l=1}^{M} \bigl(V_{12}^{(n)}(l),\ V_{22}^{(n)}(l) - 1\bigr)'$$

and the two components of the mean square error

$$\mathrm{MSE}^{(n)} := \frac{1}{M}\sum_{l=1}^{M} \bigl((V_{12}^{(n)}(l))^2,\ (V_{22}^{(n)}(l) - 1)^2\bigr)',$$

based on the $M$ replications $V^{(n)}(l)$, $l = 1, \ldots, M$.
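The two summary statistics above are straightforward to compute once the replicated estimates are available. The following sketch assumes that each estimate is a 2 x 2 shape matrix stored as nested lists and normalized so that its upper-left entry is 1; the function name `bias_and_mse` is hypothetical.

```python
def bias_and_mse(estimates, true_shape):
    """Componentwise empirical bias and mean square error.

    estimates  : list of M shape matrices (2x2 nested lists, entry (1,1) = 1)
    true_shape : the true 2x2 shape matrix (the identity in the spherical case)
    Returns ((bias_12, bias_22), (mse_12, mse_22)).
    """
    M = len(estimates)
    d12 = [V[0][1] - true_shape[0][1] for V in estimates]
    d22 = [V[1][1] - true_shape[1][1] for V in estimates]
    bias = (sum(d12) / M, sum(d22) / M)
    mse = (sum(x * x for x in d12) / M, sum(x * x for x in d22) / M)
    return bias, mse
```

With two toy replications `[[1, 0.1], [0.1, 1.2]]` and `[[1, -0.1], [-0.1, 0.8]]` and the identity as true shape, both bias components vanish while the mean square errors are about 0.01 and 0.04.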

These simulations show that the proposed rank-based estimators behave
remarkably well under all distributions under consideration and significantly
improve on Tyler's estimator. They confirm the optimality of the Tyler-
based $f_1$-score R-estimators under radial density $f_1$ and essentially agree
with the ARE rankings presented in Table 1. Also, the van der Waerden
rank-based estimator (based on preliminary estimator $V_T^{(n)}$ or $V_{\mathcal{N}}^{(n)}$)
uniformly dominates the parametric Gaussian estimator $V_{\mathcal{N}}^{(n)}$ and performs
equally well in the normal case; this dominance, which is observed under
both lighter-than-normal and heavier-than-normal tail distributions,
provides an empirical validation of the Chernoff-Savage result of [38].

The behavior of one-step rank-based estimators does not seem to depend
much on the preliminary estimator used ($V_T^{(n)}$ or $V_{\mathcal{N}}^{(n)}$), confirming that the
influence of the preliminary estimator is asymptotically nil. More surprising
is the fact that R-estimators based on $V_{\mathcal{N}}^{(n)}$ behave reasonably well under
heavy tails (under $t_{0.5}$), although $V_{\mathcal{N}}^{(n)}$ itself is not even root-n consistent
there (which explains its total collapse under $t_{0.5}$). Quite remarkably, these
conclusions are equally valid for small ($n = 50$) as for large ($n = 250$) sample
sizes. This is another non-negligible advantage of our method over kernel-
based ones (see Section 4.1), which typically require much larger sample
sizes.

APPENDIX 

A.1. Local asymptotic linearity. Rather than Proposition 2.1(v), we prove
in this section a more general asymptotic linearity result in which both the
location and the shape parameters are locally perturbed.

Proposition A.1. For any bounded sequences of $k$-dimensional
vectors $t^{(n)}$ and symmetric matrices $v^{(n)}$ satisfying $v_{11}^{(n)} = 0$, and for any
$g_1 \in \mathcal{F}_A$, the central sequence $\utilde{\Delta}_{f_1}^{(n)}(\theta, V)$ satisfies, under $\mathrm{P}^{(n)}_{\theta,\sigma^2,V;g_1}$, as $n \to \infty$,
the asymptotic linearity property

$$\utilde{\Delta}_{f_1}^{(n)}(\theta + n^{-1/2}t^{(n)},\ V + n^{-1/2}v^{(n)}) - \utilde{\Delta}_{f_1}^{(n)}(\theta, V)
  = -\mathcal{J}_k(f_1,g_1)\,\Upsilon_k^{-1}(V)\,\overset{\circ}{\mathrm{vech}}(v^{(n)}) + o_{\mathrm{P}}(1). \tag{A.1}$$

The proof of Proposition A.1 relies on a series of lemmas. In this section,
we let $\theta^n := \theta + n^{-1/2}t^{(n)}$ and $V^n := V + n^{-1/2}v^{(n)}$. Accordingly, let
$Z_i^0 := V^{-1/2}(X_i - \theta)$, $d_i^0 := \|Z_i^0\|$, $U_i^0 := Z_i^0/d_i^0$, $Z_i^n := (V^n)^{-1/2}(X_i - \theta^n)$,
$d_i^n := \|Z_i^n\|$ and $U_i^n := Z_i^n/d_i^n$. We begin with the following preliminary result:



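Lemma A.1. For all $i$, as $n \to \infty$, under $\mathrm{P}^{(n)}_{\theta,\sigma^2,V;g_1}$,

(i) $|d_i^n - d_i^0| = o_{\mathrm{P}}(1)$ and

(ii) $\|U_i^n - U_i^0\| = o_{\mathrm{P}}(1)$.

PROOF. First, note that, defining $\|M\|_{\mathcal{L}} := \sup_{\|x\|=1} \|Mx\|$,

$$\|Z_i^n - Z_i^0\| \leq \|(V^n)^{-1/2}(\theta^n - \theta)\| + \|((V^n)^{-1/2} - V^{-1/2})(X_i - \theta)\|
  \leq n^{-1/2}\|(V^n)^{-1/2}\|_{\mathcal{L}}\,\|t^{(n)}\| + \|(V^n)^{-1/2} - V^{-1/2}\|_{\mathcal{L}}\,\|V^{1/2}\|_{\mathcal{L}}\,d_i^0
  \leq C(n)(1 + d_i^0)$$

for some positive sequence $C(n)$, with $C(n) = o(1)$ as $n \to \infty$. Now, since for
all $\delta > 0$, $\mathrm{P}^{(n)}_{\theta,\sigma^2,V;g_1}[C(n)(d_i^0)^a > \delta] = o(1)$ as $n \to \infty$ ($a = -1, 0, 1$), we obtain
that $\|Z_i^n - Z_i^0\|$ and $\|Z_i^n - Z_i^0\|/d_i^0$ are $o_{\mathrm{P}}(1)$ under $\mathrm{P}^{(n)}_{\theta,\sigma^2,V;g_1}$ as $n \to \infty$.
The result follows since (i) $|d_i^n - d_i^0| \leq \|Z_i^n - Z_i^0\|$ and (ii) $\|U_i^n - U_i^0\| \leq
|1/d_i^n - 1/d_i^0|\,\|Z_i^n\| + \|Z_i^n - Z_i^0\|/d_i^0 \leq 2\|Z_i^n - Z_i^0\|/d_i^0$. □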
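The deterministic inequality behind part (ii), namely that the distance between two unit vectors is at most twice the distance between the underlying vectors divided by the norm of the second one, can be checked numerically. The sketch below is illustrative only; the helper names and the simulated perturbations are assumptions, not part of the proof.

```python
import math
import random


def norm(v):
    return math.sqrt(sum(x * x for x in v))


def unit(v):
    d = norm(v)
    return [x / d for x in v]


def sign_gap_and_bound(z_n, z_0):
    """Return (||U^n - U^0||, 2 ||Z^n - Z^0|| / d^0) for nonzero vectors."""
    u_n, u_0 = unit(z_n), unit(z_0)
    gap = norm([a - b for a, b in zip(u_n, u_0)])
    bound = 2.0 * norm([a - b for a, b in zip(z_n, z_0)]) / norm(z_0)
    return gap, bound


rng = random.Random(42)
pairs = []
for _ in range(1000):
    z_0 = [rng.gauss(0.0, 1.0) for _ in range(3)]
    z_n = [z + 0.1 * rng.gauss(0.0, 1.0) for z in z_0]  # small perturbation
    pairs.append(sign_gap_and_bound(z_n, z_0))
```

Every pair in `pairs` should satisfy gap <= bound (up to rounding), in line with the last chain of inequalities in the proof.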
Proof of Proposition A.1. We first consider the following truncation
of the score function $K_{f_1}$. For all $\ell \in \mathbb{N}_0$, define

$$K_{f_1}^{(\ell)}(u) := K_{f_1}\!\Bigl(\frac{2}{\ell}\Bigr)\,\ell\Bigl(u - \frac{1}{\ell}\Bigr) I_{[\frac{1}{\ell} < u \leq \frac{2}{\ell}]}
  + K_{f_1}(u)\,I_{[\frac{2}{\ell} < u \leq 1 - \frac{2}{\ell}]}
  + K_{f_1}\!\Bigl(1 - \frac{2}{\ell}\Bigr)\,\ell\Bigl(\Bigl(1 - \frac{1}{\ell}\Bigr) - u\Bigr) I_{[1 - \frac{2}{\ell} < u \leq 1 - \frac{1}{\ell}]},$$

where $I_A$ denotes the indicator function of $A$. Since $u \mapsto K_{f_1}(u)$ is
continuous, the functions $u \mapsto K_{f_1}^{(\ell)}(u)$ are also continuous on $(0, 1)$. It follows that
the truncated scores $K_{f_1}^{(\ell)}$ are bounded for all $\ell$. Clearly, it can be safely
assumed that $K_{f_1}$ is a monotone increasing function (rather than the
difference of two monotone increasing functions) so that there exists some $L$ such
that $|K_{f_1}^{(\ell)}(u)| \leq |K_{f_1}(u)|$ for all $u \in (0,1)$ and all $\ell \geq L$.
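A truncation of this type can be sketched in a few lines. The code below assumes the piecewise form in which the score is left unchanged on the central interval, interpolated linearly down to zero on the two adjacent intervals of length $1/\ell$, and set to zero outside; this specific form, the function name `truncate_score` and the toy score used in the usage note are assumptions made for illustration.

```python
def truncate_score(K, ell):
    """Return a continuous, bounded truncation K_ell of a score function K on (0, 1).

    K_ell equals K on (2/ell, 1 - 2/ell], is linear on (1/ell, 2/ell] and on
    (1 - 2/ell, 1 - 1/ell], and vanishes outside (1/ell, 1 - 1/ell].
    """
    if ell <= 4:
        raise ValueError("ell must exceed 4 for the central interval to be non-empty")
    a, b = 1.0 / ell, 2.0 / ell

    def K_ell(u):
        if a < u <= b:
            return K(b) * ell * (u - a)            # ramp from 0 up to K(2/ell)
        if b < u <= 1.0 - b:
            return K(u)                            # untouched central part
        if 1.0 - b < u <= 1.0 - a:
            return K(1.0 - b) * ell * ((1.0 - a) - u)  # ramp down to 0
        return 0.0

    return K_ell
```

For instance, with the unbounded toy score `K = lambda u: u / (1.0 - u)` and `ell = 10`, the truncated version agrees with `K` at `u = 0.5`, vanishes at `u = 0.05` and `u = 0.95`, and never exceeds `K(0.8)` in absolute value.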

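We have to prove that, under $\mathrm{P}^{(n)}_{\theta,\sigma^2,V;g_1}$, as $n \to \infty$,

$$\utilde{\Delta}_{f_1}^{(n)}(\theta^n, V^n) - \utilde{\Delta}_{f_1}^{(n)}(\theta, V) + \mathcal{J}_k(f_1,g_1)\,\Upsilon_k^{-1}(V)\,\overset{\circ}{\mathrm{vech}}(v^{(n)}) \tag{A.2}$$

is $o_{\mathrm{P}}(1)$. Proposition 2.1(ii) shows that $\utilde{\Delta}_{f_1}^{(n)}(\theta, V) - \Delta^{*(n)}_{f_1;g_1}(\theta, V)$ is $o_{\mathrm{P}}(1)$
as $n \to \infty$, under the same sequence of hypotheses. Similarly, the difference
$\utilde{\Delta}_{f_1}^{(n)}(\theta^n, V^n) - \Delta^{*(n)}_{f_1;g_1}(\theta^n, V^n)$ is $o_{\mathrm{P}}(1)$ as $n \to \infty$, under $\mathrm{P}^{(n)}_{\theta^n,\sigma^2,V^n;g_1}$ (hence,
from contiguity, also under $\mathrm{P}^{(n)}_{\theta,\sigma^2,V;g_1}$). Consequently, (A.2) is asymptotically
equivalent, under $\mathrm{P}^{(n)}_{\theta,\sigma^2,V;g_1}$, to

$$\Delta^{*(n)}_{f_1;g_1}(\theta^n, V^n) - \Delta^{*(n)}_{f_1;g_1}(\theta, V) + \mathcal{J}_k(f_1,g_1)\,\Upsilon_k^{-1}(V)\,\overset{\circ}{\mathrm{vech}}(v^{(n)}). \tag{A.3}$$

Now, $n^{-1/2} J_k^{\perp}\,\mathrm{vec}\bigl[\sum_{i=1}^{n} K_{f_1}(\tilde{G}_{1k}(d_i^n/\sigma))\,U_i^n U_i^{n\prime}\bigr]$, under $\mathrm{P}^{(n)}_{\theta^n,\sigma^2,V^n;g_1}$, is
asymptotically normal as $n \to \infty$, with mean zero and covariance matrix
$(k(k+2))^{-1}\mathcal{J}_k(f_1)\,[I_{k^2} + K_k - \frac{2}{k}J_k]$, so that

$$\tfrac{1}{2}\,n^{-1/2} M_k\bigl[((V^n)^{\otimes 2})^{-1/2} - (V^{\otimes 2})^{-1/2}\bigr] J_k^{\perp}\,\mathrm{vec}\Bigl[\sum_{i=1}^{n} K_{f_1}(\tilde{G}_{1k}(d_i^n/\sigma))\,U_i^n U_i^{n\prime}\Bigr]$$

is $o_{\mathrm{P}}(1)$ as $n \to \infty$ under $\mathrm{P}^{(n)}_{\theta^n,\sigma^2,V^n;g_1}$, as well as under $\mathrm{P}^{(n)}_{\theta,\sigma^2,V;g_1}$ (by
contiguity). Consequently, (A.3) is asymptotically equivalent, under $\mathrm{P}^{(n)}_{\theta,\sigma^2,V;g_1}$, to

$$C^{(n)} := \tfrac{1}{2}\,n^{-1/2} M_k (V^{\otimes 2})^{-1/2} J_k^{\perp}\,\mathrm{vec}\Bigl[\sum_{i=1}^{n} K_{f_1}(\tilde{G}_{1k}(d_i^n/\sigma))\,U_i^n U_i^{n\prime}\Bigr]
  - \tfrac{1}{2}\,n^{-1/2} M_k (V^{\otimes 2})^{-1/2} J_k^{\perp}\,\mathrm{vec}\Bigl[\sum_{i=1}^{n} K_{f_1}(\tilde{G}_{1k}(d_i^0/\sigma))\,U_i^0 U_i^{0\prime}\Bigr]
  + \mathcal{J}_k(f_1,g_1)\,\Upsilon_k^{-1}(V)\,\overset{\circ}{\mathrm{vech}}(v^{(n)}) \tag{A.4}$$

and we need only prove that $C^{(n)} = o_{\mathrm{P}}(1)$. Decompose $C^{(n)}$ into
$D_1^{(n;\ell)} + D_2^{(n;\ell)} + R_1^{(n;\ell)} + R_2^{(n;\ell)} + R_3^{(n;\ell)}$, where, denoting by $\mathrm{E}_0$ expectation
under $\mathrm{P}^{(n)}_{\theta,\sigma^2,V;g_1}$ and defining $\mathcal{J}_k^{(\ell)}(f_1;g_1) := \int_0^1 K_{f_1}^{(\ell)}(u)\,K_{g_1}(u)\,du$,

$$D_1^{(n;\ell)} := \tfrac{1}{2}\,n^{-1/2} M_k (V^{\otimes 2})^{-1/2} J_k^{\perp}\,\Bigl\{\mathrm{vec}\Bigl[\sum_{i=1}^{n} K_{f_1}^{(\ell)}(\tilde{G}_{1k}(d_i^n/\sigma))\,U_i^n U_i^{n\prime}\Bigr]
  - \mathrm{vec}\Bigl[\sum_{i=1}^{n} K_{f_1}^{(\ell)}(\tilde{G}_{1k}(d_i^0/\sigma))\,U_i^0 U_i^{0\prime}\Bigr]
  - \mathrm{E}_0\Bigl[\mathrm{vec}\Bigl[\sum_{i=1}^{n} K_{f_1}^{(\ell)}(\tilde{G}_{1k}(d_i^n/\sigma))\,U_i^n U_i^{n\prime}\Bigr]\Bigr]\Bigr\},$$

$$D_2^{(n;\ell)} := \tfrac{1}{2}\,n^{-1/2} M_k (V^{\otimes 2})^{-1/2} J_k^{\perp}\,\mathrm{E}_0\Bigl[\mathrm{vec}\Bigl[\sum_{i=1}^{n} K_{f_1}^{(\ell)}(\tilde{G}_{1k}(d_i^n/\sigma))\,U_i^n U_i^{n\prime}\Bigr]\Bigr]
  + \mathcal{J}_k^{(\ell)}(f_1;g_1)\,\Upsilon_k^{-1}(V)\,\overset{\circ}{\mathrm{vech}}(v^{(n)}),$$

$$R_1^{(n;\ell)} := \tfrac{1}{2}\,n^{-1/2} M_k (V^{\otimes 2})^{-1/2} J_k^{\perp}
  \times \mathrm{vec}\Bigl[\sum_{i=1}^{n} \bigl[K_{f_1}^{(\ell)}(\tilde{G}_{1k}(d_i^0/\sigma)) - K_{f_1}(\tilde{G}_{1k}(d_i^0/\sigma))\bigr]\,U_i^0 U_i^{0\prime}\Bigr],$$

$$R_2^{(n;\ell)} := \tfrac{1}{2}\,n^{-1/2} M_k (V^{\otimes 2})^{-1/2} J_k^{\perp}
  \times \mathrm{vec}\Bigl[\sum_{i=1}^{n} \bigl[K_{f_1}(\tilde{G}_{1k}(d_i^n/\sigma)) - K_{f_1}^{(\ell)}(\tilde{G}_{1k}(d_i^n/\sigma))\bigr]\,U_i^n U_i^{n\prime}\Bigr]$$

and

$$R_3^{(n;\ell)} := \bigl(\mathcal{J}_k(f_1,g_1) - \mathcal{J}_k^{(\ell)}(f_1;g_1)\bigr)\,\Upsilon_k^{-1}(V)\,\overset{\circ}{\mathrm{vech}}(v^{(n)}).$$

We prove that $C^{(n)} = o_{\mathrm{P}}(1)$ (thus completing the proof of Proposition A.1)
by establishing that $D_1^{(n;\ell)}$ and $D_2^{(n;\ell)}$ are $o_{\mathrm{P}}(1)$ under $\mathrm{P}^{(n)}_{\theta,\sigma^2,V;g_1}$, as $n \to \infty$,
for fixed $\ell$ and that $R_1^{(n;\ell)}$, $R_2^{(n;\ell)}$ and $R_3^{(n;\ell)}$ are $o_{\mathrm{P}}(1)$ under the same
sequence of hypotheses, as $\ell \to \infty$, uniformly in $n$. For the sake of convenience,
these three results are treated separately (Lemmas A.2, A.3 and A.4). □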

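Lemma A.2. For any fixed $\ell$, $\mathrm{E}_0[\|D_1^{(n;\ell)}\|^2] = o(1)$ as $n \to \infty$.

Lemma A.3. For any fixed $\ell$, $D_2^{(n;\ell)} = o(1)$ as $n \to \infty$.

Lemma A.4. As $\ell \to \infty$, uniformly in $n$,

(i) $R_1^{(n;\ell)}$ is $o_{\mathrm{P}}(1)$ under $\mathrm{P}^{(n)}_{\theta,\sigma^2,V;g_1}$,

(ii) $R_2^{(n;\ell)}$ is $o_{\mathrm{P}}(1)$ under $\mathrm{P}^{(n)}_{\theta,\sigma^2,V;g_1}$ for $n$ sufficiently large,

(iii) $R_3^{(n;\ell)}$ is $o(1)$.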



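Proof of Lemma A.2. First, note that

$$D_1^{(n;\ell)} = \tfrac{1}{2}\,n^{-1/2} M_k (V^{\otimes 2})^{-1/2} J_k^{\perp} \sum_{i=1}^{n} \bigl(T_i^{(n;\ell)} - \mathrm{E}_0[T_i^{(n;\ell)}]\bigr),$$

where $T_i^{(n;\ell)} := \mathrm{vec}\bigl[K_{f_1}^{(\ell)}(\tilde{G}_{1k}(d_i^n/\sigma))\,U_i^n U_i^{n\prime} - K_{f_1}^{(\ell)}(\tilde{G}_{1k}(d_i^0/\sigma))\,U_i^0 U_i^{0\prime}\bigr]$,
$i = 1, \ldots, n$, are i.i.d. Writing $\mathrm{Var}_0$ for variances under $\mathrm{P}^{(n)}_{\theta,\sigma^2,V;g_1}$, we have

$$\mathrm{E}_0[\|D_1^{(n;\ell)}\|^2] \leq C n^{-1}\,\mathrm{E}_0\Bigl[\Bigl\|\sum_{i=1}^{n} \bigl[T_i^{(n;\ell)} - \mathrm{E}_0[T_i^{(n;\ell)}]\bigr]\Bigr\|^2\Bigr]
  \leq C n^{-1}\,\mathrm{tr}\Bigl[\mathrm{Var}_0\Bigl[\sum_{i=1}^{n} \bigl[T_i^{(n;\ell)} - \mathrm{E}_0[T_i^{(n;\ell)}]\bigr]\Bigr]\Bigr]
  = C\,\mathrm{tr}\bigl[\mathrm{Var}_0[T_1^{(n;\ell)}]\bigr] \leq C\,\mathrm{E}_0[\|T_1^{(n;\ell)}\|^2] \tag{A.5}$$

and it only remains to be shown that

$$\mathrm{E}_0[\|T_1^{(n;\ell)}\|^2] = \mathrm{E}_0\bigl[\bigl\|K_{f_1}^{(\ell)}(\tilde{G}_{1k}(d_1^n/\sigma))\,\mathrm{vec}[U_1^n U_1^{n\prime}] - K_{f_1}^{(\ell)}(\tilde{G}_{1k}(d_1^0/\sigma))\,\mathrm{vec}[U_1^0 U_1^{0\prime}]\bigr\|^2\bigr] = o(1)$$

as $n \to \infty$. Noting that $\|\mathrm{vec}(uv')\| = \|u\|\,\|v\|$, we have

$$\bigl\|K_{f_1}^{(\ell)}(\tilde{G}_{1k}(d_1^n/\sigma))\,\mathrm{vec}[U_1^n U_1^{n\prime}] - K_{f_1}^{(\ell)}(\tilde{G}_{1k}(d_1^0/\sigma))\,\mathrm{vec}[U_1^0 U_1^{0\prime}]\bigr\|^2
  \leq 2\,\bigl|K_{f_1}^{(\ell)}(\tilde{G}_{1k}(d_1^n/\sigma)) - K_{f_1}^{(\ell)}(\tilde{G}_{1k}(d_1^0/\sigma))\bigr|^2\,\|\mathrm{vec}[U_1^n U_1^{n\prime}]\|^2
  + 2\,\bigl|K_{f_1}^{(\ell)}(\tilde{G}_{1k}(d_1^0/\sigma))\bigr|^2\,\|\mathrm{vec}[U_1^n U_1^{n\prime} - U_1^0 U_1^{0\prime}]\|^2
  \leq C\,\bigl|K_{f_1}^{(\ell)}(\tilde{G}_{1k}(d_1^n/\sigma)) - K_{f_1}^{(\ell)}(\tilde{G}_{1k}(d_1^0/\sigma))\bigr|^2 + C\,\|U_1^n - U_1^0\|^2,$$

for some constant $C$. Lemma A.1(i) and the continuity of $K_{f_1}^{(\ell)} \circ \tilde{G}_{1k}$ together
imply that $K_{f_1}^{(\ell)}(\tilde{G}_{1k}(d_1^n/\sigma)) - K_{f_1}^{(\ell)}(\tilde{G}_{1k}(d_1^0/\sigma)) = o_{\mathrm{P}}(1)$, under $\mathrm{P}^{(n)}_{\theta,\sigma^2,V;g_1}$,
as $n \to \infty$. Since $K_{f_1}^{(\ell)}$ is bounded, this convergence to zero also holds in
quadratic mean. Similarly, using Lemma A.1(ii) and the boundedness of $U_1^0$
and $U_1^n$, we obtain that $\|U_1^n - U_1^0\|$ is $o(1)$ in quadratic mean, as $n \to \infty$,
under $\mathrm{P}^{(n)}_{\theta,\sigma^2,V;g_1}$. The convergence in (A.5) then follows. □

Proof of Lemma A.3. Defining

$$B^{(n;\ell)} := \tfrac{1}{2}\,n^{-1/2} M_k (V^{\otimes 2})^{-1/2} J_k^{\perp}\,\mathrm{vec}\Bigl[\sum_{i=1}^{n} K_{f_1}^{(\ell)}(\tilde{G}_{1k}(d_i^0/\sigma))\,U_i^0 U_i^{0\prime}\Bigr],$$

one can show that, under $\mathrm{P}^{(n)}_{\theta,\sigma^2,V;g_1}$, as $n \to \infty$,

$$B^{(n;\ell)} \overset{\mathcal{L}}{\longrightarrow} \mathcal{N}\bigl(0,\ \mathrm{E}[(K_{f_1}^{(\ell)}(U))^2]\,\Upsilon_k^{-1}(V)\bigr) \tag{A.6}$$

[throughout, $U$ stands for a random variable uniformly distributed over
$(0, 1)$]. Under the sequence of local alternatives $\mathrm{P}^{(n)}_{\theta^n,\sigma^2,V^n;g_1}$,

$$B^{(n;\ell)} - \mathcal{J}_k^{(\ell)}(f_1;g_1)\,\Upsilon_k^{-1}(V)\,\overset{\circ}{\mathrm{vech}}(v^{(n)}) \overset{\mathcal{L}}{\longrightarrow} \mathcal{N}\bigl(0,\ \mathrm{E}[(K_{f_1}^{(\ell)}(U))^2]\,\Upsilon_k^{-1}(V)\bigr)$$

as $n \to \infty$. Defining $B_n^{(n;\ell)} := \tfrac{1}{2}\,n^{-1/2} M_k (V^{\otimes 2})^{-1/2} J_k^{\perp}\,\mathrm{vec}\bigl[\sum_{i=1}^{n} K_{f_1}^{(\ell)}(\tilde{G}_{1k}(d_i^n/\sigma))\,U_i^n U_i^{n\prime}\bigr]$,
it follows from ULAN that, under $\mathrm{P}^{(n)}_{\theta,\sigma^2,V;g_1}$,

$$B_n^{(n;\ell)} + \mathcal{J}_k^{(\ell)}(f_1;g_1)\,\Upsilon_k^{-1}(V)\,\overset{\circ}{\mathrm{vech}}(v^{(n)}) \overset{\mathcal{L}}{\longrightarrow} \mathcal{N}\bigl(0,\ \mathrm{E}[(K_{f_1}^{(\ell)}(U))^2]\,\Upsilon_k^{-1}(V)\bigr). \tag{A.7}$$

Now, from (A.6) and the fact that, under $\mathrm{P}^{(n)}_{\theta,\sigma^2,V;g_1}$,
$B_n^{(n;\ell)} - B^{(n;\ell)} - \mathrm{E}_0[B_n^{(n;\ell)}] = o_{\mathrm{P}}(1)$ as $n \to \infty$ (Lemma A.2), we obtain that

$$B_n^{(n;\ell)} - \mathrm{E}_0[B_n^{(n;\ell)}] \overset{\mathcal{L}}{\longrightarrow} \mathcal{N}\bigl(0,\ \mathrm{E}[(K_{f_1}^{(\ell)}(U))^2]\,\Upsilon_k^{-1}(V)\bigr) \tag{A.8}$$

as $n \to \infty$, under $\mathrm{P}^{(n)}_{\theta,\sigma^2,V;g_1}$. Comparing (A.7) and (A.8), it follows that
$D_2^{(n;\ell)} = \mathrm{E}_0[B_n^{(n;\ell)}] + \mathcal{J}_k^{(\ell)}(f_1;g_1)\,\Upsilon_k^{-1}(V)\,\overset{\circ}{\mathrm{vech}}(v^{(n)})$ is $o(1)$ as $n \to \infty$. □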

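We now complete the proof of Proposition A.1 by proving Lemma A.4.

Proof of Lemma A.4. (i) In view of the independence, under $\mathrm{P}^{(n)}_{\theta,\sigma^2,V;g_1}$,
between the $d_i^0$'s and the $U_i^0$'s, we obtain, for all $n$,

$$\mathrm{E}_0[\|R_1^{(n;\ell)}\|^2] \leq \frac{C}{n}\sum_{i=1}^{n} \mathrm{E}_0\bigl[\bigl(K_{f_1}(\tilde{G}_{1k}(d_i^0/\sigma)) - K_{f_1}^{(\ell)}(\tilde{G}_{1k}(d_i^0/\sigma))\bigr)^2\bigr]
  \times \mathrm{E}_0\bigl[[\mathrm{vec}\,U_i^0 U_i^{0\prime}]'\,J_k^{\perp}\,[\mathrm{vec}\,U_i^0 U_i^{0\prime}]\bigr]
  = \frac{C(k-1)}{k}\int_0^1 \bigl(K_{f_1}(u) - K_{f_1}^{(\ell)}(u)\bigr)^2\,du. \tag{A.9}$$

Now, $K_{f_1}^{(\ell)}(u)$ converges to $K_{f_1}(u)$ for all $u \in (0,1)$. Also, since $|K_{f_1}^{(\ell)}(u)|$
is bounded by $|K_{f_1}(u)|$, for all $\ell \geq L$, the integrand in (A.9) is bounded
(uniformly in $\ell$) by $4|K_{f_1}(u)|^2$, which is integrable on $(0,1)$. The Lebesgue
dominated convergence theorem thus yields that $\mathrm{E}_0[\|R_1^{(n;\ell)}\|^2] = o(1)$ as
$\ell \to \infty$. This convergence is uniform in $n$, since the constant $C$ in (A.9) does not
depend on $n$.

(ii) The claim in part (ii) of the lemma is the same as in part (i),
except that $d_i^n$ and $U_i^n$ replace $d_i^0$ and $U_i^0$, respectively. Accordingly, part (ii)
holds under $\mathrm{P}^{(n)}_{\theta^n,\sigma^2,V^n;g_1}$. That it also holds under $\mathrm{P}^{(n)}_{\theta,\sigma^2,V;g_1}$ follows from
Lemma 3.5 of [23].

(iii) Note that $|\mathcal{J}_k(f_1,g_1) - \mathcal{J}_k^{(\ell)}(f_1;g_1)|^2 = \bigl|\int_0^1 (K_{f_1}(u) - K_{f_1}^{(\ell)}(u))\,K_{g_1}(u)\,du\bigr|^2
\leq \mathcal{J}_k(g_1)\int_0^1 |K_{f_1}(u) - K_{f_1}^{(\ell)}(u)|^2\,du$. Again, $|K_{f_1}^{(\ell)}(u) - K_{f_1}(u)|^2 \leq 4|K_{f_1}(u)|^2$
with $\int_0^1 |K_{f_1}(u)|^2\,du < \infty$. Pointwise convergence of $(K_{f_1}^{(\ell)})$ to $K_{f_1}$ implies that
$\mathcal{J}_k(f_1,g_1) - \mathcal{J}_k^{(\ell)}(f_1;g_1) = o(1)$ as $\ell \to \infty$. The result then follows from the
boundedness of $(v^{(n)})$. □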

A.2. Properties of R-estimators.

Proof of Proposition 3.1. (i) The asymptotic representations (3.5)
and (3.6) are just restatements of (3.2) and (3.3), to which we refer for
the proof. The convergence in (3.7) then readily results from part (iii) of
Proposition 2.1. As for (3.8), it follows directly from the fact that
$\mathrm{vec}(\utilde{V}_{f_1\#}^{(n)} - V) = M_k'\,\overset{\circ}{\mathrm{vech}}(\utilde{V}_{f_1\#}^{(n)} - V)$ and the definition of $Q_k(V)$.

(ii) Semiparametric efficiency follows from the fact that $\mathcal{J}_k(f_1,f_1) = \mathcal{J}_k(f_1)$
so that under $\mathrm{P}^{(n)}_{\sigma^2,V;f_1}$, the asymptotic variance in (3.7) reduces to $\mathcal{J}_k(f_1)^{-1}\Upsilon_k(V)$,
the inverse of the efficient information matrix $\Gamma^{*}_{f_1}(V)$.

(iii) From (3.4) and (3.1) [with $R_i = R_i^{(n)}(V_\#^{(n)})$ and $U_i = U_i^{(n)}(V_\#^{(n)})$],

vech(V^ ) = vech(V^) + ^^ ^(h^Ni *®<yf ) 
= + ^^ N ^ fc(V # ))((V # ))02rV2 



1=1 



if/J^vec^UD-^vec^) 



where we used the fact that (see Section 4.2 for the definition of $e_{k^2,1}$)

$$Q_k(V)N_k'M_k = Q_k(V) = [I_{k^2} - (\mathrm{vec}\,V)e_{k^2,1}'][I_{k^2} + K_k](V^{\otimes 2})[I_{k^2} - (\mathrm{vec}\,V)e_{k^2,1}']'$$
$$= [I_{k^2} + K_k](V^{\otimes 2}) - 2(V^{\otimes 2})e_{k^2,1}(\mathrm{vec}\,V)' - 2(\mathrm{vec}\,V)e_{k^2,1}'(V^{\otimes 2}) + 2(\mathrm{vec}\,V)(\mathrm{vec}\,V)';$$

see the proof of Lemma HP3.1. Routine algebra yields

vech( ft) = vech(V^) + ^W fc2 - (vec V^W^m™)**) 1 * 2 



(A.10) 



x(iE^(^)vec (Ui UD 



= vech(V^) + ^^N fc [I fc2 - (vec V^efc, J vec(W^ ) 
= vech(V^) + ^^N fc vec(wW - (^ u v{)), 

which establishes the result since $\overset{\circ}{\mathrm{vech}}\,v = \overset{\circ}{\mathrm{vech}}\,w$ if and only if $v = w$ for all $k \times k$ symmetric matrices $v = (v_{ij})$, $w = (w_{ij})$ such that $v_{11} = w_{11}$.
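The expansion of $Q_k(V)$ used in part (iii) can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes column-stacking for vec, takes $K_k$ to be the commutation matrix, takes $e_{k^2,1}$ to be the first canonical basis vector of $\mathbb{R}^{k^2}$, and uses an arbitrary symmetric $V$ with $V_{11} = 1$. All helper names are hypothetical.

```python
def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def kron(a, b):
    """Kronecker product: rows indexed by (i, p), columns by (j, q)."""
    return [[a[i][j] * b[p][q] for j in range(len(a[0])) for q in range(len(b[0]))]
            for i in range(len(a)) for p in range(len(b))]


def combine(a, b, alpha):
    """Return a + alpha * b entrywise."""
    return [[x + alpha * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def commutation(k):
    """Commutation matrix K_k, defined by K_k vec(A) = vec(A')."""
    m = [[0.0] * (k * k) for _ in range(k * k)]
    for i in range(k):
        for j in range(k):
            m[i * k + j][j * k + i] = 1.0
    return m


def q_both_sides(v_mat):
    """Return the product form and the expanded form of Q_k(V)."""
    k = len(v_mat)
    n = k * k
    v = [[v_mat[i][j]] for j in range(k) for i in range(k)]  # vec V as a column
    e = [[1.0 if r == 0 else 0.0] for r in range(n)]         # first basis vector
    vv = kron(v_mat, v_mat)                                  # V tensor V
    sym = matmul(combine(identity(n), commutation(k), 1.0), vv)
    proj = combine(identity(n), matmul(v, transpose(e)), -1.0)
    product_form = matmul(matmul(proj, sym), transpose(proj))
    expanded = combine(sym, matmul(matmul(vv, e), transpose(v)), -2.0)
    expanded = combine(expanded, matmul(matmul(v, transpose(e)), vv), -2.0)
    expanded = combine(expanded, matmul(v, transpose(v)), 2.0)
    return product_form, expanded


lhs, rhs = q_both_sides([[1.0, 0.3], [0.3, 2.0]])
max_gap = max(abs(x - y) for rl, rr in zip(lhs, rhs) for x, y in zip(rl, rr))
```

The two forms agree up to rounding only because $V$ is symmetric with $V_{11} = 1$; without that normalization the scalar $e_{k^2,1}'(V^{\otimes 2})e_{k^2,1} = V_{11}^2$ would appear in the last term.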

(iv) Due to the identification constraints, the population covariance matrix under $P^{(n)}_{\sigma^2,V;g_1}$ with finite second-order moments is not $\Sigma := \sigma^2V$, but $\eta\Sigma := k^{-1}\sigma^2D_k(g_1)V$. Provided that $\kappa_k(g_1) < \infty$, the multivariate central limit theorem yields $n^{1/2}\,\mathrm{vec}(S^{(n)} - \eta\Sigma) \xrightarrow{\mathcal{L}} \mathcal{N}(0, A)$, where

$$A := \frac{\sigma^4E_k(g_1)}{k(k+2)}[I_{k^2} + K_k](V^{\otimes 2}) + \frac{\sigma^4\kappa_k(g_1)D_k^2(g_1)}{k^2}(\mathrm{vec}\,V)(\mathrm{vec}\,V)'.$$

Now, applying Slutsky's lemma, we obtain, as $n \to \infty$, under $P^{(n)}_{\sigma^2,V;g_1}$,

$$n^{1/2}\,\mathrm{vec}(V^{(n)}_S - V) = \frac{k}{\sigma^2D_k(g_1)}[I_{k^2} - (\mathrm{vec}\,V)e_{k^2,1}'][n^{1/2}\,\mathrm{vec}(S^{(n)} - \eta\Sigma)] + o_P(1)$$
$$\xrightarrow{\mathcal{L}} \mathcal{N}\Big(0, \frac{k^2}{\sigma^4D_k^2(g_1)}[I_{k^2} - (\mathrm{vec}\,V)e_{k^2,1}']A[I_{k^2} - (\mathrm{vec}\,V)e_{k^2,1}']'\Big),$$

where the covariance matrix, after lengthy but standard algebra, reduces to $(1 + \kappa_k(g_1))Q_k(V)$, yielding the desired result; see also [36].
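Two ingredients of the "lengthy but standard algebra" in part (iv) are easy to check by hand or by machine. The sketch below assumes that $D_k(g_1)$ and $E_k(g_1)$ denote the second and fourth moments of the radial distance and that $1 + \kappa_k = kE_k/((k+2)D_k^2)$, which is the usual definition of the kurtosis parameter; those readings and the function names are assumptions, not statements from the paper.

```python
def vec(matrix):
    """Column-stacking vec of a square list-of-lists matrix, as a flat list."""
    k = len(matrix)
    return [matrix[i][j] for j in range(k) for i in range(k)]


def project(x, v):
    """Apply [I - v e1'] to the flat vector x, i.e. x - x[0] * v."""
    return [x_r - x[0] * v_r for x_r, v_r in zip(x, v)]


def kurtosis_factor(k, second_moment, fourth_moment):
    """Scalar (k^2 / D^2) * E / (k (k + 2)) multiplying Q_k(V) after projection."""
    return k * fourth_moment / ((k + 2) * second_moment ** 2)


V = [[1.0, 0.3], [0.3, 2.0]]

# Because V_11 = 1, the projection annihilates vec V, so the
# (vec V)(vec V)' term of A contributes nothing to the limit covariance.
residual = project(vec(V), vec(V))

# Gaussian radial part in dimension 3: d^2 is chi-square(3), so D = 3, E = 15.
gaussian_factor = kurtosis_factor(3, 3.0, 15.0)
```

Under these assumptions the Gaussian factor equals one, which corresponds to $\kappa_k = 0$ and a limit covariance of exactly $Q_k(V)$.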

(v) The asymptotic covariance matrices of $\mathrm{vec}(V^{(n)}_{f_1})$ in (3.8) and $\mathrm{vec}(V^{(n)}_S)$ in (iv) are proportional; AREs with respect to $V^{(n)}_S$ in (v) follow directly as ratios of the corresponding proportionality factors. AREs with respect to $V^{(n)}_T$ follow from the fact that, in the normalization adopted [i.e., $(V^{(n)}_T)_{11} = 1$], $n^{1/2}\,\mathrm{vec}(V^{(n)}_T - V)$ is asymptotically normal with mean zero and covariance matrix $((k+2)/k)Q_k(V)$. □
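Since all the limit covariances above are scalar multiples of $Q_k(V)$, an ARE is just a ratio of scalars. The following sketch illustrates this with the two factors that appear explicitly in this proof, $1 + \kappa_k(g_1)$ and $(k+2)/k$; the function names are hypothetical, and the value $\kappa_k = 2/(\nu - 4)$ for a multivariate Student distribution with $\nu > 4$ degrees of freedom is a standard fact quoted here rather than taken from the text.

```python
def are(reference_factor, candidate_factor):
    """ARE of a candidate estimator relative to a reference estimator.

    Both estimators are assumed to have limit covariance factor * Q_k(V).
    """
    return reference_factor / candidate_factor


def covariance_based_factor(kappa):
    """Factor 1 + kappa_k(g1) from part (iv)."""
    return 1.0 + kappa


def normalized_factor(k):
    """Factor (k + 2) / k from part (v)."""
    return (k + 2) / k


def student_kurtosis(nu):
    """Kurtosis parameter of a multivariate Student density, nu > 4."""
    return 2.0 / (nu - 4.0)


# Gaussian case (kappa = 0) in dimension 2: efficiency k / (k + 2) = 0.5.
gaussian_are = are(covariance_based_factor(0.0), normalized_factor(2))

# Student case with 5 degrees of freedom in dimension 3.
student_are = are(covariance_based_factor(student_kurtosis(5.0)), normalized_factor(3))
```

Values above one mean that the candidate estimator is asymptotically more efficient than the covariance-based one at the density considered.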




A.3. Asymptotic equivariance.

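Proof of Proposition 5.1.
(i) We first prove that

(A.11) $W^{(n)}_{f_1} - V^{(n)}_\# = O_P(n^{-1/2})$

under $P^{(n)}_{\sigma^2,V;g_1}$, as $n \to \infty$ [recall that $W^{(n)}_{f_1} := W^{(n)}_{f_1}(V^{(n)}_\#)$]. To this end, define

$$T^{(n)}_{f_1}(V) := n^{-1/2}(V^{\otimes 2})^{1/2}\sum_{i=1}^n\Big[K_{f_1}\Big(\frac{R_i}{n+1}\Big)\mathrm{vec}(U_iU_i') - \frac{m^{(n)}_{f_1}}{k}\mathrm{vec}(I_k)\Big]$$

[with $R_i = R_i^{(n)}(V)$ and $U_i = U_i^{(n)}(V)$], which is asymptotically normal with mean zero and covariance matrix $\frac{1}{k(k+2)}J_k(f_1)H_k(V)$, where

$$H_k(V) := (V^{\otimes 2})^{1/2}\Big[I_{k^2} + K_k - \frac{2}{k}J_k\Big](V^{\otimes 2})^{1/2}.$$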

Proceeding exactly as in the proof of Proposition A.1, we obtain that for any bounded sequence $v^{(n)}$ of symmetric matrices such that $v^{(n)}_{11} = 0$, the difference

$$T^{(n)}_{f_1}(V + n^{-1/2}v^{(n)}) - T^{(n)}_{f_1}(V) + \frac{1}{2k(k+2)}J_k(f_1,g_1)H_k(V)(V^{\otimes 2})^{-1}\mathrm{vec}(v^{(n)})$$

is $o_P(1)$, under $P^{(n)}_{\sigma^2,V;g_1}$, as $n \to \infty$. The local discreteness of $V^{(n)}_\#$ allows us to replace the nonrandom quantity $V^{(n)} = V + n^{-1/2}v^{(n)}$ with the random one $V^{(n)}_\#$ (see, e.g., [30], Lemma 4.4), yielding

$$T^{(n)}_{f_1}(V^{(n)}_\#) - T^{(n)}_{f_1}(V) + \frac{1}{2k(k+2)}J_k(f_1,g_1)H_k(V)(V^{\otimes 2})^{-1}n^{1/2}\,\mathrm{vec}(V^{(n)}_\# - V) = o_P(1),$$

under $P^{(n)}_{\sigma^2,V;g_1}$, as $n \to \infty$. This establishes (A.11) since

(A.12)
$$n^{1/2}\,\mathrm{vec}(W^{(n)}_{f_1} - V^{(n)}_\#) = T^{(n)}_{f_1}(V^{(n)}_\#) + n^{1/2}k^{-1}(m^{(n)}_{f_1} - k)\,\mathrm{vec}(V^{(n)}_\#)$$
$$= n^{1/2}k^{-1}(m^{(n)}_{f_1} - k)\,\mathrm{vec}(V^{(n)}_\#) + T^{(n)}_{f_1}(V) - \frac{1}{2k(k+2)}J_k(f_1,g_1)H_k(V)(V^{\otimes 2})^{-1}n^{1/2}\,\mathrm{vec}(V^{(n)}_\# - V) + o_P(1)$$

(still under $P^{(n)}_{\sigma^2,V;g_1}$, as $n \to \infty$) and since the square-integrability of $K_{f_1}$ over $(0,1)$ implies that $m^{(n)}_{f_1} - k = m^{(n)}_{f_1} - \int_0^1 K_{f_1}(u)\,du = o(n^{-1/2})$ (see the proof of Proposition 3.2(i) in [16]).
Now, denoting by := V^(co) the pseudo-estimator defined in (3.1), it follows from (A.11) that [letting $b := k(k+2)J_k^{-1}(f_1,g_1)$]

x [I fc2 - (vecVW)e' fe M]vec(W<^ - v£>) 

is op(n -1 / 2 ), under pw, as n — ► oo. This yields the result since, in Sec- 
tion 4.2, we proved that V^™^ — = op(n~ 1 / 2 ), under p( n ), as n — ► oo. 

(ii) If $V^{(n)}$ is strictly affine-equivariant [in the sense of (5.1)], then, using the same notation as in Section 5, $(V^{(n)}(M,a))^{1/2} = d\,M(V^{(n)})^{1/2}O$ for some $d > 0$ and some $k \times k$ orthogonal matrix $O$ (see, e.g., [40]). The strict affine-equivariance of the practical implementation $\lim_{m\to\infty}V^{(n)}_{f_1}(\infty,m)$ [which is based on $V^{(n)}$ and $W^{(n)}_{f_1}(V^{(n)})$ instead of $V^{(n)}_\#$ and $W^{(n)}_{f_1}$] follows. □